How to close all gRPC server streams using GracefulStop? - go

I'm trying to stop all clients connected to a stream server from the server side.
I'm using the GracefulStop method to handle this gracefully.
I am waiting for an os.Interrupt signal on a channel to perform the graceful stop of the gRPC server, but server.GracefulStop() gets stuck whenever a client is connected.
func (s *Service) Subscribe(_ *empty.Empty, srv clientapi.ClientApi_SubscribeServer) error {
    ctx := srv.Context()
    updateCh := make(chan *clientapi.Update, 100)
    stopCh := make(chan bool)
    defer func() {
        stopCh <- true
        close(updateCh)
    }()
    go func() {
        ticker := time.NewTicker(1 * time.Second)
        defer func() {
            ticker.Stop()
            close(stopCh)
        }()
        for {
            select {
            case <-stopCh:
                return
            case <-ticker.C:
                updateCh <- &clientapi.Update{Name: "notification", Payload: "sample notification every 1 second"}
            }
        }
    }()
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case notif := <-updateCh:
            err := srv.Send(notif)
            if err == io.EOF {
                return nil
            }
            if err != nil {
                s.logger.Named("Subscribe").Error("error", zap.Error(err))
                continue
            }
        }
    }
}
I expected ctx.Done() in the method to fire and break the for loop.
How do I close all response streams like this one?

Create a global context for your gRPC service. Walking through the various pieces:
Each gRPC service request would use this context (along with the client context) to fulfill that request.
The os.Interrupt handler would cancel the global context, thus canceling any currently running requests.
Finally, issue server.GracefulStop(), which should wait for all the active gRPC calls to finish up (if they haven't already seen the cancellation).
So for example, when setting up the gRPC service:
pctx := context.Background()
globalCtx, globalCancel := context.WithCancel(pctx)
mysrv := MyService{
    gctx: globalCtx,
}
s := grpc.NewServer()
pb.RegisterMyService(s, mysrv)
os.Interrupt handler initiates and waits for shutdown:
globalCancel()
server.GracefulStop()
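For completeness, a minimal sketch of that handler wired to the setup above (the signal plumbing itself isn't shown in this answer; imports from os and os/signal are assumed):
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, os.Interrupt)
<-sigCh          // block until Ctrl-C
globalCancel()   // cancel the global context; running RPCs see mctx.Done()
s.GracefulStop() // then wait for the in-flight calls to drain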
gRPC methods:
func (s *MyService) SomeRpcMethod(ctx context.Context, req *pb.Request) error {
    // merge client and server contexts into one `mctx`
    // (client context will cancel if client disconnects)
    // (server context will cancel if service Ctrl-C'ed)
    mctx, mcancel := mergeContext(ctx, s.gctx)
    defer mcancel() // so we don't leak, if neither client nor server context cancels

    // RPC WORK GOES HERE
    // RPC WORK GOES HERE
    // RPC WORK GOES HERE

    // pass mctx to any blocking calls:
    // - http REST calls
    // - SQL queries etc.
    // - or if running a long loop, status-check the context occasionally like so:

    // Example long request (10s)
    for i := 0; i < 10*1000; i++ {
        time.Sleep(1 * time.Millisecond)

        // poll merged context
        select {
        case <-mctx.Done():
            return fmt.Errorf("request canceled: %s", mctx.Err())
        default:
        }
    }
    return nil
}
And:
func mergeContext(a, b context.Context) (context.Context, context.CancelFunc) {
    mctx, mcancel := context.WithCancel(a) // will cancel if `a` cancels
    go func() {
        select {
        case <-mctx.Done(): // don't leak go-routine on clean gRPC run
        case <-b.Done():
            mcancel() // b canceled, so cancel mctx
        }
    }()
    return mctx, mcancel
}

Typically clients need to assume that RPCs can terminate at any moment (e.g. due to connection errors or server power failure). So what we do is call GracefulStop, sleep for a short period to give in-flight RPCs an opportunity to complete naturally, then hard-Stop the server. If you do need to use this termination signal to end your RPCs, then the answer by @colminator is probably the best choice. But this situation should be unusual, and you may want to spend some time analyzing your design if you find it necessary to manually end streaming RPCs at server shutdown.
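For illustration, a hedged sketch of that shutdown sequence (srv is the *grpc.Server; the ten-second grace period is an arbitrary choice):
stopped := make(chan struct{})
go func() {
    srv.GracefulStop() // returns once all in-flight RPCs have finished
    close(stopped)
}()
select {
case <-stopped:
    // clean shutdown: all RPCs completed within the grace period
case <-time.After(10 * time.Second):
    srv.Stop() // grace period expired: hard-stop any remaining streams
}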

Related

Hold subscribe method with custom handler nats golang

I am writing a wrapper on top of the NATS client in Go. I want to take a handler function that is invoked by the consumer once I get a message from the NATS server.
I want to hold my custom Subscribe method open until it receives a message from NATS.
Publish:
func (busConfig BusConfig) Publish(service string, data []byte) error {
    pubErr := conn.Publish(service, data)
    if pubErr != nil {
        return pubErr
    }
    return nil
}
Subscribe:
func (busConfig BusConfig) Subscribe(subject string, handler func(msg []byte)) {
    fmt.Println("Subscribing on: ", subject)
    //wg := sync.WaitGroup{}
    //wg.Add(1)
    subscription, err := conn.Subscribe(subject, func(msg *nats.Msg) {
        go func() {
            handler(msg.Data)
        }()
        //wg.Done()
    })
    if err != nil {
        fmt.Println("Subscriber error: ", err)
    }
    //wg.Wait()
    defer subscription.Unsubscribe()
}
test case:
func TestLifeCycleEvent(t *testing.T) {
    busClient := GetBusClient()
    busClient.Subscribe(SUBJECT, func(input []byte) {
        fmt.Println("Life cycle event received:", string(input))
    })
    busClient.Publish(SUBJECT, []byte("complete notification"))
}
I can see the message being published, but it is never received by the subscriber. I tried to hold the Subscribe method open using a WaitGroup, but I don't think that is the correct solution.
You don't see the message being delivered because Subscribe is an async method that spawns a goroutine to handle the incoming messages and call the callback.
Straight after calling busClient.Publish(), your application exits. It does not wait for anything to happen inside Subscribe().
When you use nats.Subscribe(), you usually have a long-running application that exits only under specific conditions (like receiving a shutdown signal). A WaitGroup can work here, but probably only for tests, not for real applications.
You should also call the Flush() method on the NATS connection to ensure all buffered messages have been sent before exiting the program.
If you want a synchronous method, you can use nats.SubscribeSync().
Check out examples here: https://natsbyexample.com/examples/messaging/pub-sub/go
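For example, a minimal synchronous variant of the test combining SubscribeSync and Flush (connecting directly via nats.DefaultURL instead of the wrapper is an assumption here):
func TestLifeCycleEventSync(t *testing.T) {
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        t.Fatal(err)
    }
    defer nc.Close()

    sub, err := nc.SubscribeSync(SUBJECT)
    if err != nil {
        t.Fatal(err)
    }
    if err := nc.Publish(SUBJECT, []byte("complete notification")); err != nil {
        t.Fatal(err)
    }
    nc.Flush() // make sure the publish has reached the server

    // Block until the message arrives or the timeout expires.
    msg, err := sub.NextMsg(2 * time.Second)
    if err != nil {
        t.Fatal(err)
    }
    fmt.Println("Life cycle event received:", string(msg.Data))
}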
From my understanding, I think in NATS we need to respond to the message even if we are not providing a reply address:
func (busConfig BusConfig) Subscribe(subject string, handler func(msg []byte)) {
    _, err := conn.Subscribe(subject, func(msg *nats.Msg) {
        go func() {
            handler(msg.Data)
            msg.Respond(nil)
        }()
    })
    if err != nil {
        fmt.Println("Subscriber error: ", err)
    }
}

Fire and forget goroutine golang

I have written an API that makes DB calls and does some business logic. It invokes a goroutine that must perform some operation in the background.
Since the API call should not wait for this background task to finish, I return 200 OK immediately after starting the goroutine (let us assume the background task will never return an error).
I read that a goroutine is terminated once it has completed its task.
Is this fire-and-forget approach safe from goroutine leaks?
Are goroutines terminated and cleaned up once they have done their job?
func DefaultHandler(w http.ResponseWriter, r *http.Request) {
    // Some DB calls
    // Some business logic
    go func() {
        // some task taking 5 sec
    }()
    w.WriteHeader(http.StatusOK)
}
I would recommend always keeping your goroutines under control to avoid memory and system exhaustion.
If you receive a spike of requests and start spawning goroutines without control, the system will go down sooner or later.
In cases where you need to return an immediate 200 OK, the best approach is to use a message queue: the server only needs to create a job in the queue, return the OK, and forget about it. The rest is handled by a consumer asynchronously.
Producer (HTTP server) >>> Queue >>> Consumer
Normally, the queue is an external resource (RabbitMQ, AWS SQS...), but for teaching purposes you can achieve the same effect using a channel as a message queue.
In the example you'll see how we create a channel to communicate between two processes.
Then we start the worker process that reads from the channel, and later the server with a handler that writes to it.
Try playing with the buffer size and job duration while sending curl requests.
package main

import (
    "fmt"
    "log"
    "net/http"
    "time"
)

/*
$ go run .
curl "http://localhost:8080?user_id=1"
curl "http://localhost:8080?user_id=2"
curl "http://localhost:8080?user_id=3"
curl "http://localhost:8080?user_id=....."
*/

func main() {
    queueSize := 10
    // This is our queue, a channel to communicate between processes.
    // Queue size is the number of items that can be stored in the channel.
    myJobQueue := make(chan string, queueSize) // Search for 'buffered channels'
    // Start a worker that will read continuously from our queue.
    go myBackgroundWorker(myJobQueue)
    // We start our server with a handler that receives the queue to write to it.
    if err := http.ListenAndServe("localhost:8080", myAsyncHandler(myJobQueue)); err != nil {
        panic(err)
    }
}

func myAsyncHandler(myJobQueue chan<- string) http.HandlerFunc {
    return func(rw http.ResponseWriter, r *http.Request) {
        // We check that the query string has a 'user_id' query param.
        if userID := r.URL.Query().Get("user_id"); userID != "" {
            select {
            case myJobQueue <- userID: // We try to put the item into the queue ...
                rw.WriteHeader(http.StatusOK)
                rw.Write([]byte(fmt.Sprintf("queuing user process: %s", userID)))
            default: // If we cannot write to the queue, it's because it is full!
                rw.WriteHeader(http.StatusInternalServerError)
                rw.Write([]byte(`our internal queue is full, try it later`))
            }
            return
        }
        rw.WriteHeader(http.StatusBadRequest)
        rw.Write([]byte(`missing 'user_id' in query params`))
    }
}

func myBackgroundWorker(myJobQueue <-chan string) {
    const jobDuration = 10 * time.Second // simulation of a heavy background process
    // We continuously read from our queue and process jobs one by one.
    // In this loop we could spawn more goroutines in a controlled way to
    // parallelize work and increase the read throughput, but I don't want
    // to overcomplicate the example.
    for userID := range myJobQueue {
        // rate limiter here ...
        // go func(u string){
        log.Printf("processing user: %s, started", userID)
        time.Sleep(jobDuration)
        log.Printf("processing user: %s, finished", userID)
        // }(userID)
    }
}
There is no "goroutine cleaning" you have to handle; you just launch goroutines and they'll be cleaned up when the function launched as a goroutine returns. Quoting from Spec: Go statements:
When the function terminates, its goroutine also terminates. If the function has any return values, they are discarded when the function completes.
So what you do is fine. Note however that your launched goroutine cannot use or assume anything about the request (r) and response writer (w); you may only use them before you return from the handler.
Also note that you don't have to write http.StatusOK: if you return from the handler without writing anything, that's assumed to be a success and HTTP 200 OK will be sent back automatically.
See related / possible duplicate: Webhook process run on another goroutine
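To make that concrete, a small sketch (the user_id parameter is made up for illustration): copy whatever you need out of r before spawning the goroutine, and never touch w or r afterwards.
func DefaultHandler(w http.ResponseWriter, r *http.Request) {
    // Copy values out of the request first; r is not safe to use
    // after the handler returns.
    userID := r.URL.Query().Get("user_id")
    go func(id string) {
        // Runs after the handler has returned; it only uses the copied value.
        log.Printf("background task for user %s started", id)
    }(userID)
    // Returning without writing anything sends HTTP 200 OK automatically.
}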
@icza is absolutely right: there is no "goroutine cleaning". You can use a webhook or a background-job library like gocraft instead. The only way I can think of to use your solution is with the sync package, for learning purposes:
func DefaultHandler(w http.ResponseWriter, r *http.Request) {
    // Some DB calls
    // Some business logic
    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        // some task taking 5 sec
    }()
    w.WriteHeader(http.StatusOK)
    wg.Wait()
}
You can wait for a goroutine to finish using a sync.WaitGroup:
// BusyTask
func BusyTask(t interface{}) error {
    var wg = &sync.WaitGroup{}
    wg.Add(1)
    go func() {
        // busy doing stuff
        time.Sleep(5 * time.Second)
        wg.Done()
    }()
    wg.Wait() // wait for the goroutine to finish
    return nil
}

// this will wait 5 seconds until the goroutine finishes
func main() {
    fmt.Println("hello")
    BusyTask("some task...")
    fmt.Println("done")
}
Another way is to attach a context.Context to the goroutine and time it out:
func BusyTaskContext(ctx context.Context, t string) error {
    // buffered so the worker can still send after a timeout (avoids a goroutine leak)
    done := make(chan struct{}, 1)
    go func() {
        // simulate 5 seconds of work
        time.Sleep(5 * time.Second)
        // do tasks and signal done
        done <- struct{}{}
        close(done)
    }()
    select {
    case <-ctx.Done():
        return errors.New("timeout")
    case <-done:
        return nil
    }
}

func main() {
    fmt.Println("hello")
    ctx, cancel := context.WithTimeout(context.TODO(), 2*time.Second)
    defer cancel()
    if err := BusyTaskContext(ctx, "some task..."); err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println("done")
}

How to release a websocket and redis gateway server resource in golang?

I have a gateway server that can push messages to the client side using websockets. When a new client connects, I generate a cid for it and subscribe to a Redis channel named after that cid; any message published to the channel is pushed to the client. Each unit works fine on its own, but when I run a benchmark test with thor, the server crashes. I found that DeliverMessage has an issue: it never exits, because it runs an endless loop. But since Redis pub/sub needs to block on Receive, I don't know how to avoid the loop.
func (h *Hub) DeliverMessage(pool *redis.Pool) {
    conn := pool.Get()
    defer conn.Close()
    var gPubSubConn *redis.PubSubConn
    gPubSubConn = &redis.PubSubConn{Conn: conn}
    defer gPubSubConn.Close()
    for {
        switch v := gPubSubConn.Receive().(type) {
        case redis.Message:
            // fmt.Printf("Channel=%q | Data=%s\n", v.Channel, string(v.Data))
            h.Push(string(v.Data))
        case redis.Subscription:
            fmt.Printf("Subscription message: %s : %s %d\n", v.Channel, v.Kind, v.Count)
        case error:
            fmt.Println("Error pub/sub, delivery has stopped", v)
            panic("Error pub/sub")
        }
    }
}
In the main function, I call the above function as:
go h.DeliverMessage(pool)
But when I test it with a huge number of connections, I get an error like:
ERR max number of clients reached
So I changed the Redis pool size by changing MaxIdle:
func newPool(addr string) *redis.Pool {
    return &redis.Pool{
        MaxIdle:     5000,
        IdleTimeout: 240 * time.Second,
        Dial:        func() (redis.Conn, error) { return redis.Dial("tcp", addr) },
    }
}
But it still doesn't work. So I'd like to know: is there a good way to kill the subscriber goroutine once the websocket client has disconnected from my server, in the select branch below?
case client := <-h.Unregister:
    if _, ok := h.Clients[client]; ok {
        delete(h.Clients, client)
        delete(h.Connections, client.CID)
        close(client.Send)
        if err := gPubSubConn.Unsubscribe(client.CID); err != nil {
            panic(err)
        }
        // TODO kill the subscriber goroutine once the client side has disconnected ...
    }
But how do I identify this goroutine? Can I kill it the Unix way, like kill -9 <PID>?
Look at the example here.
You can make your goroutine exit by having a return statement inside your switch case in DeliverMessage once you're not receiving anything more. I'm guessing on case error, or, as seen in the example, when the subscription count reaches 0, you'd want to return, and your goroutine will exit. Or, if I'm misunderstanding things and case client := <-h.Unregister: is inside DeliverMessage, just return.
You're also closing your connection twice: defer gPubSubConn.Close() simply calls conn.Close(), so you don't need defer conn.Close().
Also take a look at the Pool and what all the parameters actually do. If you want to handle many connections, set MaxActive to 0: "When zero, there is no limit on the number of connections in the pool." (And do you actually want the idle timeout?)
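Putting those suggestions together, a sketch of DeliverMessage that returns instead of panicking, so the goroutine can actually exit (returning when the subscription count drops to zero is one possible policy, not the only one):
func (h *Hub) DeliverMessage(pool *redis.Pool) {
    gPubSubConn := &redis.PubSubConn{Conn: pool.Get()}
    defer gPubSubConn.Close() // also closes the underlying connection
    for {
        switch v := gPubSubConn.Receive().(type) {
        case redis.Message:
            h.Push(string(v.Data))
        case redis.Subscription:
            if v.Count == 0 {
                return // nothing left subscribed; let the goroutine exit
            }
        case error:
            fmt.Println("Error pub/sub, delivery has stopped:", v)
            return // exit the goroutine instead of crashing the server
        }
    }
}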
Actually, I had the wrong design architecture; I am going to explain what I want to do.
1. A client can connect to my websocket server.
2. The server has several HTTP handlers, and the admin can post data via a handler; the structure of the data looks like:
{
    "cid": "something",
    "body": {
    }
}
Since several Nodes are running to serve our clients, and Nginx can dispatch each request from the admin to a totally different Node, while only one Node holds the connection for the cid "something", I need to publish this data to Redis; whichever Node gets the data will send the message to the client side.
3. Look up the NodeID that I am going to publish to, given a cid:
// redis code & golang
NodeID, err := conn.Do("HGET", "NODE_MAP", cid)
4. For now, I can publish any message from the admin to the NodeID, which we got at step 3:
// redis code & golang
NodeID, err := conn.Do("PUBLISH", NodeID, data)
5. Time to show the core code related to this question. I am going to subscribe to a channel whose name is the NodeID, like the following:
go func() {
    for {
        switch v := gPubSubConn.Receive().(type) {
        case redis.Message:
            fmt.Println("Got a message", v.Data)
            h.Broadcast <- v.Data
            pipeline <- v.Data
        case error:
            panic(v)
        }
    }
}()
6. To manage your websockets, you also need a goroutine, like the following:
go func() {
    for {
        select {
        case client := <-h.Register:
            h.Clients[client] = true
            cid := client.CID
            h.Connections[cid] = client
            body := "something"
            client.Send <- []byte(body) // greeting (assuming Send is a chan []byte)
        case client := <-h.Unregister:
            if _, ok := h.Clients[client]; ok {
                delete(h.Clients, client)
                delete(h.Connections, client.CID)
                close(client.Send)
            }
        case message := <-h.Broadcast:
            fmt.Println("message is", message)
        }
    }
}()
The last thing is to manage a Redis pool. You don't really need a big connection pool right now, since we only have two goroutines and one main process:
func newPool(addr string) *redis.Pool {
    return &redis.Pool{
        MaxIdle:     100,
        IdleTimeout: 240 * time.Second,
        Dial:        func() (redis.Conn, error) { return redis.Dial("tcp", addr) },
    }
}

var (
    pool        *redis.Pool
    redisServer = flag.String("redisServer", ":6379", "")
)

pool = newPool(*redisServer)
conn := pool.Get()
defer conn.Close()

Graceful shutdown of gRPC downstream

Using the following protocol buffer definition:
syntax = "proto3";
package pb;
message SimpleRequest {
int64 number = 1;
}
message SimpleResponse {
int64 doubled = 1;
}
// All the calls in this serivce preform the action of doubling a number.
// The streams will continuously send the next double, eg. 1, 2, 4, 8, 16.
service Test {
// This RPC streams from the server only.
rpc Downstream(SimpleRequest) returns (stream SimpleResponse);
}
I'm able to successfully open a stream, and continuously get the next doubled number from the server.
My Go code for running this looks like:
ctxDownstream, cancel := context.WithCancel(ctx)
downstream, err := testClient.Downstream(ctxDownstream, &pb.SimpleRequest{Number: 1})
for {
    responseDownstream, err := downstream.Recv()
    if err != io.EOF {
        println(fmt.Sprintf("downstream response: %d, error: %v", responseDownstream.Doubled, err))
        if responseDownstream.Doubled >= 32 {
            break
        }
    }
}
cancel() // !!This is not a graceful shutdown
println(fmt.Sprintf("%v", downstream.Trailer()))
The problem I'm having is that using context cancellation leaves my downstream.Trailer() response empty. Is there a way to gracefully close this connection from the client side and still receive downstream.Trailer()?
Note: if I close the downstream connection from the server side, my trailers are populated. But I have no way of instructing the server side to close this particular stream, so there must be a way to gracefully close a stream from the client side.
Thanks.
As requested, some server code:
func (b *binding) Downstream(req *pb.SimpleRequest, stream pb.Test_DownstreamServer) error {
    request := req
    r := make(chan *pb.SimpleResponse)
    e := make(chan error)
    ticker := time.NewTicker(200 * time.Millisecond)
    defer func() { ticker.Stop(); close(r); close(e) }()
    go func() {
        defer func() { recover() }()
        for {
            select {
            case <-ticker.C:
                response, err := b.Endpoint(stream.Context(), request)
                if err != nil {
                    e <- err
                }
                r <- response
            }
        }
    }()
    for {
        select {
        case err := <-e:
            return err
        case response := <-r:
            if err := stream.Send(response); err != nil {
                return err
            }
            request.Number = response.Doubled
        case <-stream.Context().Done():
            return nil
        }
    }
}
You will still need to populate the trailer with some information. I use the grpc.StreamServerInterceptor to do this.
According to the grpc-go documentation:
Trailer returns the trailer metadata from the server, if there is any. It must only be called after stream.CloseAndRecv has returned, or stream.Recv has returned a non-nil error (including io.EOF).
So if you want to read the trailer in the client, try something like this:
ctxDownstream, cancel := context.WithCancel(ctx)
defer cancel()
for {
    ...
    // on error or EOF
    break
}
println(fmt.Sprintf("%v", downstream.Trailer()))
Break from the infinite loop when there is an error and print the trailer. cancel will be called at the end of the function, as it is deferred.
I can't find a reference that explains it clearly, but this doesn't appear to be possible.
On the wire, grpc-status is followed by the trailer metadata when the call completes normally (i.e. the server ends the call).
When the client cancels the call, neither of these is sent.
It seems that gRPC treats call cancellation as a quick abort of the RPC, not much different from the socket being dropped.
Adding a "cancel message" via request streaming works; the server can pick this up and cancel the stream from its end and trailers will still get sent:
message SimpleRequest {
    oneof RequestType {
        int64 number = 1;
        bool cancel = 2;
    }
}
....
rpc Downstream(stream SimpleRequest) returns (stream SimpleResponse);
Although this does add a bit of complication to the code.
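For illustration, a client-side sketch under that modified contract; the oneof wrapper names (pb.SimpleRequest_Number, pb.SimpleRequest_Cancel) follow protoc-gen-go conventions, so verify them against your generated code:
stream, err := testClient.Downstream(ctx)
if err != nil {
    log.Fatal(err)
}
// Kick off the doubling stream.
if err := stream.Send(&pb.SimpleRequest{
    RequestType: &pb.SimpleRequest_Number{Number: 1},
}); err != nil {
    log.Fatal(err)
}
cancelSent := false
for {
    res, err := stream.Recv()
    if err != nil {
        break // io.EOF once the server ends the call; trailers are now set
    }
    if res.Doubled >= 32 && !cancelSent {
        cancelSent = true
        // Ask the server to end the stream from its side, then keep
        // draining until it closes with io.EOF.
        if err := stream.Send(&pb.SimpleRequest{
            RequestType: &pb.SimpleRequest_Cancel{Cancel: true},
        }); err != nil {
            log.Fatal(err)
        }
    }
}
fmt.Printf("%v\n", stream.Trailer())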

grpc go: how to know on the server side when the client closes the connection

I am using grpc-go.
I have an RPC which looks roughly like this:
service MyService {
    // Operation 1
    rpc Operation1(OperationRequest) returns (OperationResponse) {
        option (google.api.http) = {
            post: "/apiver/myser/oper1"
            body: "*"
        };
    }
The client connects using the grpc.Dial() method.
When a client connects, the server does some bookkeeping. When the client disconnects, the bookkeeping needs to be removed.
Is there any callback that can be registered which can be used to know that the client has closed the session?
Based on your code, it's a unary RPC call: the client connects to the server just once, sends a request, and gets a response. The client will wait for the response until it times out.
In server-side streaming, you can detect the client disconnecting from the
<-grpc.ServerStream.Context.Done()
signal.
With that, you can implement your own channel in a goroutine to build your logic. Use a select statement like:
select {
case <-srv.Context().Done():
    return
case res := <-<YOUR OWN CHANNEL, WITH RECEIVED REQUEST OR YOUR RESPONSE>:
    ....
}
I provide some detailed code here.
In client streaming, besides the above signal, you can check whether the server can still receive messages:
req, err := grpc.ServerStream.Recv()
if err == io.EOF {
    break
} else if err != nil {
    return err
}
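Here is a compact, hypothetical server-streaming handler showing the pattern end to end (the pb types are placeholders, not from the question):
func (s *server) Subscribe(req *pb.SubscribeRequest, stream pb.MyService_SubscribeServer) error {
    ticker := time.NewTicker(time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-stream.Context().Done():
            // The client disconnected or cancelled: remove bookkeeping here.
            return stream.Context().Err()
        case <-ticker.C:
            if err := stream.Send(&pb.Update{}); err != nil {
                return err // Send also fails once the client is gone
            }
        }
    }
}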
Assuming that the server is implemented in Go, there's an API on the *grpc.ClientConn that reports state changes in the connection.
func (cc *ClientConn) WaitForStateChange(ctx context.Context, sourceState connectivity.State) bool
https://godoc.org/google.golang.org/grpc#ClientConn.WaitForStateChange
These are the docs on each of the connectivity.State
https://github.com/grpc/grpc/blob/master/doc/connectivity-semantics-and-api.md
If you need to expose a channel that you can listen to for the client closing the connection then you could do something like this:
func connectionOnState(ctx context.Context, conn *grpc.ClientConn, states ...connectivity.State) <-chan struct{} {
    done := make(chan struct{})
    go func() {
        // any return from this func will close the channel
        defer close(done)
        // continue checking for state change
        // until one of the break states is found
        for {
            change := conn.WaitForStateChange(ctx, conn.GetState())
            if !change {
                // ctx is done, return
                // something upstream is cancelling
                return
            }
            currentState := conn.GetState()
            for _, s := range states {
                if currentState == s {
                    // matches one of the states passed
                    // return, closing the done channel
                    return
                }
            }
        }
    }()
    return done
}
If you only want to consider connections that are shutting down or shutdown, then you could call it like so:
// any receives from shutdownCh will mean the state Shutdown
shutdownCh := connectionOnState(ctx, conn, connectivity.Shutdown)
As in the linked GitHub issue, you can do it like this:
err = stream.Context().Err()
if err != nil {
    break
}
