How to create a server for a persistent stream (aka pub/sub) in Go gRPC

I am building a service that needs to send events to all subscribed consumers in a pub/sub manner, i.e. send each event to all currently connected clients.
I am using Protobuf for that with the following proto definition:
service EventsService {
rpc ListenForEvents (AgentProcess) returns (stream Event) {}
}
Both server & client are written in Go.
My problem is that when the client initiates a connection, the stream is not long-lived. For example, when the server returns from the ListenForEvents method:
func (e EventsService) ListenForEvents(process *pb.AgentProcess, listener pb.EventsService_ListenForEventsServer) error {
//persist listener here so it can be used later when backend needs to send some messages to client
return nil
}
then the client almost instantly gets an EOF error, which means that the server probably closed the connection.
What do I do so that the client stays subscribed to the server for a long time? The main problem is that I might not have anything to send to the client when it calls the ListenForEvents method on the server, which is why I want this stream to be long-lived, so that I can send messages later.

The stream terminates when you return from the server function. Instead, you should receive events somehow and send them to the client without returning from your handler. There are probably many ways you can do this. Below is a sketch of one way of doing it.
This relies on each ListenForEvents call running on its own goroutine, so a handler can block without holding up other clients. There is a Broadcast() function that sends a message to all connected clients. It looks like this:
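// One channel per registered client; clientsLock guards the map.
var allRegisteredClients = map[*pb.AgentProcess]chan Message{}
var clientsLock sync.RWMutex

func Broadcast(msg Message) {
	clientsLock.RLock()
	defer clientsLock.RUnlock()
	for _, ch := range allRegisteredClients {
		ch <- msg // blocks until that client's handler receives it
	}
}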
Then, your clients have to register themselves, and process messages:
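func (e EventsService) ListenForEvents(process *pb.AgentProcess, listener pb.EventsService_ListenForEventsServer) error {
	// Register a channel for this client.
	ch := make(chan Message)
	clientsLock.Lock()
	allRegisteredClients[process] = ch
	clientsLock.Unlock()

	// Staying in this loop keeps the stream open; returning ends it.
	for msg := range ch {
		_ = msg // placeholder until the send below is written
		// send message
		// Deal with errors
		// Deal with client terminations
	}

	// Unregister once the loop ends.
	clientsLock.Lock()
	delete(allRegisteredClients, process)
	clientsLock.Unlock()
	return nil
}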
As I said, this is only a sketch of the idea.
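
One gap the sketch leaves open is a slow or departed client: a blocking send inside Broadcast stalls every other subscriber. A minimal, standard-library-only sketch of the same registry with non-blocking sends might look like the following. The names Broker, Subscribe, and Count are hypothetical, messages are plain strings for brevity, and dropping a message when a subscriber's buffer is full is an assumption that may not suit every service.

```go
package main

import (
	"fmt"
	"sync"
)

// Broker fans each message out to every current subscriber.
// Hypothetical type for illustration; it is not part of gRPC.
type Broker struct {
	mu   sync.RWMutex
	subs map[chan string]struct{}
}

func NewBroker() *Broker {
	return &Broker{subs: make(map[chan string]struct{})}
}

// Subscribe registers a buffered channel and returns it together with a
// function that unregisters it. Calling that function twice is harmless.
func (b *Broker) Subscribe(buffer int) (<-chan string, func()) {
	ch := make(chan string, buffer)
	b.mu.Lock()
	b.subs[ch] = struct{}{}
	b.mu.Unlock()

	var once sync.Once
	unsubscribe := func() {
		once.Do(func() {
			b.mu.Lock()
			delete(b.subs, ch)
			b.mu.Unlock()
		})
	}
	return ch, unsubscribe
}

// Broadcast offers msg to every subscriber and reports how many accepted it.
// A subscriber whose buffer is full is skipped instead of blocking the rest.
func (b *Broker) Broadcast(msg string) int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	delivered := 0
	for ch := range b.subs {
		select {
		case ch <- msg:
			delivered++
		default: // slow subscriber: drop rather than stall the broadcast
		}
	}
	return delivered
}

// Count reports the number of registered subscribers.
func (b *Broker) Count() int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return len(b.subs)
}

func main() {
	b := NewBroker()
	events, unsubscribe := b.Subscribe(4)
	defer unsubscribe()

	fmt.Println("delivered to", b.Broadcast("deploy finished"), "subscriber(s)")
	fmt.Println("received:", <-events)
}
```

Inside ListenForEvents, the handler would call Subscribe, defer the returned function, and then loop with a select on listener.Context().Done() and the channel, so the stream stays open until the client goes away.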

I have managed to nail it down.
Basically, I never return from the ListenForEvents method.
It creates a channel, stores it in a global-like map of subscribed clients, and keeps reading from that channel indefinitely.
The whole implementation of server logic:
func (e EventsService) ListenForEvents(process *pb.AgentProcess, listener pb.EventsService_ListenForEventsServer) error {
	// NOTE: handlers run concurrently, so access to e.listeners should be
	// guarded by a mutex, and the channel should be removed from the map
	// when this handler returns. Both are left out of this snippet.
	chanForThisClient := make(chan *pb.Event)
	// append works on a missing key (nil slice), so no exists check is needed.
	e.listeners[process.Hostname] = append(e.listeners[process.Hostname], chanForThisClient)
	for {
		select {
		case <-listener.Context().Done():
			// The client disconnected or cancelled the call.
			return nil
		case res := <-chanForThisClient:
			if err := listener.Send(res); err != nil {
				return err
			}
		}
	}
}

You need to provide keepalive settings for the gRPC client and server.
See details here: https://github.com/grpc/grpc/blob/master/doc/keepalive.md
Examples: https://github.com/grpc/grpc-go/tree/master/examples/features/keepalive

Related

Message still in nats limit queue after ack and term sent in Go

I tried writing a subscriber for a NATS limit queue:
sub, err := js.SubscribeSync(fullSubject, nats.Context(ctx))
if err != nil {
return err
}
msg, err := sub.NextMsgWithContext(ctx)
if err != nil {
if errors.Is(err, nats.ErrSlowConsumer) {
log.Printf("Slow consumer error returned. Waiting for reset...")
time.Sleep(50 * time.Millisecond)
continue
} else {
return err
}
}
msg.InProgress()
var message pnats.NatsMessage
if err := conn.unmarshaller(msg.Data, &message); err != nil {
msg.Term()
return err
}
actualSubject := message.Context.FullSubject()
handler, ok := callbacks[message.Context.Category]
if !ok {
msg.Nak()
continue
}
callback, err := handler(&message)
if err == nil {
msg.Ack()
msg.Term()
} else {
msg.Nak()
return err
}
callback(ctx)
The goal of this code is to consume any message on a number of subjects and call a callback function associated with the subject. This code works, but the issue I'm running into is that I'd like the message to be deleted after the call to handler if that function doesn't return an error. I thought that's what msg.Term was doing, but I still see all the messages in the queue.
I had originally designed this around a work queue but I wanted it to work with multiple subscribers so I had to redesign it. Is there any way to make this work?
Based on the code provided, I assume that you are not providing stream and consumer info when creating a subscription with the JetStream library.
In the documentation for the SubscribeSync method, it says that when stream and consumer information is not provided, the library will create an ephemeral consumer and the name of the consumer is picked by the server. It also attempts to figure out which stream the subscription is for.
Here is what I believe happens in your code:
When you call the SubscribeSync method, an ephemeral consumer is created for the subject you provided.
When msg.Ack and msg.Term are called, you do acknowledge the message, but only for that current consumer.
The next time you call the SubscribeSync method, a new ephemeral consumer is created, and it is delivered the message that you already acknowledged on the other consumer. This is how the JetStream concepts of streams, consumers, and subscriptions work by design.
Based on what you want to accomplish, here are some suggestions:
Use the plain NATS Core library to work with either pub/sub or a queue, and don't use JetStream. The NATS Core library works with subjects directly, whereas the JetStream library creates additional things (streams and consumers) under the hood if that information is not provided.
Use JetStream, but create a stream and a durable consumer yourself, either through code or directly on the NATS server. With a stream and a consumer already defined, you should be able to make it work as intended.

Race conditions in client synchronization

I have a web app whose server creates a Client for each websocket connection. A Client acts as an intermediary between the websocket connection and a single instance of a Hub. The Hub maintains a set of registered clients and broadcasts messages to the clients. This works pretty well but the problem is that a client might miss events between when the server generates the initial state bundle that the client receives on connection and when the client is registered with the hub and starts receiving broadcast events.
My idea is to register the client with the hub before any information is fetched from the db. That would ensure that the client doesn't miss any broadcasts, though now it could receive messages that are already applied to the initial state it receives. To allow the client to disregard these messages I could include a monotonic timestamp in both the initial state bundle as well as broadcast events.
Can you think of a more elegant/simpler solution?
I have used a write-ahead log in the past to do something like this. In short, keep a ring buffer of messages in the hub. Then replay the messages that were sent to existing clients while the new one was being initialized.
You can expose this concept to the clients too if you wish. That way you can implement efficient reconnects (particularly nice for mobile connections). When clients lose the websocket connection they can reconnect and say "Hey there, it's me again. Looks like we got interrupted. The last message I've seen was number 42. What's new?"
The following is from memory, so take this only as an illustration of the idea, not a finished implementation. In the interest of brevity I've omitted the select statements around client.send, for instance.
package main

import (
	"container/list"
	"sync"

	"github.com/gorilla/websocket"
)

type Client struct { // all unchanged
	hub  *Hub
	conn *websocket.Conn
	send chan []byte
}

type Hub struct {
	mu         sync.RWMutex      // guards wal and clients
	wal        list.List         // List of recent messages
	clients    map[*Client]bool  // Registered clients.
	register   chan Registration // not a chan *Client anymore
	broadcast  chan []byte
	unregister chan *Client
}

type Registration struct {
	client *Client
	// init is a function that is executed before the client starts to receive
	// broadcast messages. All messages that are broadcast while init is
	// running will be sent after init returns.
	init func()
}

func (h *Hub) run() {
	for {
		select {
		case reg := <-h.register:
			// Take note of the most recent message as of right now.
			// initClient will replay all later messages.
			h.mu.RLock()
			head := h.wal.Back() // nil if nothing has been broadcast yet
			h.mu.RUnlock()
			go h.initClient(reg, head)
		case client := <-h.unregister:
			h.mu.Lock()
			if _, ok := h.clients[client]; ok {
				delete(h.clients, client)
				close(client.send)
			}
			h.mu.Unlock()
		case message := <-h.broadcast:
			h.mu.Lock()
			h.wal.PushBack(message)
			// TODO: Trim list if too long by some metric (e.g. number of
			// messages, age, total message size, etc.)
			// Snapshot the clients so the lock is not held while sending.
			clients := make([]*Client, 0, len(h.clients))
			for client := range h.clients {
				clients = append(clients, client)
			}
			h.mu.Unlock()
			for _, client := range clients {
				// TODO: deal with backpressure
				client.send <- message
			}
		}
	}
}

func (h *Hub) initClient(reg Registration, head *list.Element) {
	reg.init()
	// send messages in h.wal after head
	for {
		// A write lock is needed because the clients map is modified below.
		h.mu.Lock()
		var next *list.Element
		if head == nil {
			next = h.wal.Front() // the log was empty at registration time
		} else {
			next = head.Next()
		}
		if next == nil {
			// caught up: from now on run() delivers broadcasts directly
			h.clients[reg.client] = true
			h.mu.Unlock()
			return
		}
		head = next
		h.mu.Unlock()
		// TODO: deal with backpressure
		reg.client.send <- head.Value.([]byte)
	}
}
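
The reconnect idea ("the last message I've seen was number 42") can be sketched on its own with sequence numbers. The following is a minimal illustration under stated assumptions, not the answer's implementation: the names Entry, Log, Append, and Since are hypothetical, messages are strings, and the log is bounded by message count only. It is not safe for concurrent use, so a hub would guard it with its mutex.

```go
package main

import "fmt"

// Entry is one broadcast message tagged with its sequence number.
type Entry struct {
	Seq  uint64
	Data string
}

// Log keeps the most recent limit entries. Sequence numbers start at 1
// and increase by one per appended message.
type Log struct {
	limit   int
	next    uint64 // sequence number the next Append will assign
	entries []Entry
}

func NewLog(limit int) *Log {
	return &Log{limit: limit, next: 1}
}

// Append stores data, trims the oldest entries beyond the limit, and
// returns the sequence number assigned to data.
func (l *Log) Append(data string) uint64 {
	seq := l.next
	l.next++
	l.entries = append(l.entries, Entry{Seq: seq, Data: data})
	if len(l.entries) > l.limit {
		l.entries = l.entries[len(l.entries)-l.limit:]
	}
	return seq
}

// Since returns every entry with a sequence number greater than last.
// ok is false when some of those entries have already been trimmed, in
// which case the client has to reload the full state instead.
func (l *Log) Since(last uint64) (missed []Entry, ok bool) {
	if last+1 >= l.next {
		return nil, true // already caught up
	}
	if len(l.entries) == 0 || l.entries[0].Seq > last+1 {
		return nil, false // the gap reaches past the retained window
	}
	// Entries are contiguous, so the start index can be computed directly.
	start := int(last + 1 - l.entries[0].Seq)
	return append([]Entry(nil), l.entries[start:]...), true
}

func main() {
	wal := NewLog(3)
	for _, m := range []string{"a", "b", "c", "d", "e"} {
		wal.Append(m)
	}
	missed, ok := wal.Since(3) // client last saw message 3
	fmt.Println(missed, ok)
	_, ok = wal.Since(1) // message 2 has been trimmed
	fmt.Println(ok)
}
```

Including the sequence number in the initial state bundle as well lets a client discard any broadcast whose number is not greater than the one its snapshot already reflects.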

CloseHandler is not called for a gorilla/websocket if I am not reading messages anywhere, I simply get a write error eventually

I have a websocket server using gorilla/websocket.
I have a situation where I am simply writing messages to a set of websockets. My custom CloseHandler is never called when I close the websocket on the browser side.
However, adding a goroutine that calls ReadMessage indefinitely (till some error) leads to the CloseHandler being invoked.
Here's the basic idea:
In one goroutine, I run something like this:
for {
for client := range clients {
client.stream <- data
}
time.Sleep(time.Second)
}
and the other code, called in a separate goroutine, one per client:
go (func() {
// If I call wsock.ReadMessage here, my CloseHandler works!
})()
for msg := range myclient.stream {
if err := wsock.WriteMessage(websocket.TextMessage, msg); err != nil {
break
}
}
When I close the websocket on the browser side, I expect the CloseHandler to be called, however, it's never called, instead, I eventually get an error on the WriteMessage call.
The close handler is called when a close message is received from the peer. The application must read the connection to receive close and other control messages.
If the application does not read the connection or the peer does not send a close message, then the close handler will not be called.
If your goal is to detect closed connections, then read the connection until an error is returned, as shown in the documentation:
func readLoop(c *websocket.Conn) {
for {
if _, _, err := c.NextReader(); err != nil {
c.Close()
break
}
}
}
The application should only set a close handler when the application must perform some action before the connection bounces the close message back to the peer.

Is there any way to close the client request in golang/gin?

Using gin framework.
Is there any way to tell the client that the request is finished, so that the server handler can run background jobs without making the client wait on the connection?
func Test(c *gin.Context) {
c.String(200, "ok")
// close client request, then do some jobs, for example sync data with remote server.
//
}
Yes, you can do that by simply returning from the handler. The background job you want to do should be started on a new goroutine.
Note that the connection and/or request may be put back into a pool, but that is irrelevant, the client will see that serving the request ended. You achieve what you want.
Something like this:
func Test(c *gin.Context) {
	c.String(200, "ok")
	// By returning from this function, the response is completed and the
	// client stops waiting (the connection itself may be kept for reuse).
	// The started goroutine will live on, of course:
	go func() {
		// This function will continue to execute...
	}()
}
Also see: Goroutine execution inside an http handler
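
The same pattern can be tried out with only the standard library. The sketch below is a plain net/http analogue, not gin code; Worker and serveOnce are hypothetical names. It copies what the job needs out of the request before the handler returns, because the request body and r.Context() are not usable afterwards. In gin, the equivalent precaution is to hand the goroutine a copy made with c.Copy() rather than the original context.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Worker replies to each request immediately and runs Job afterwards.
// Hypothetical type for illustration.
type Worker struct {
	Job func(path string) // background work; receives data copied from the request
	wg  sync.WaitGroup
}

// ServeHTTP writes the response, starts the background job, and returns.
func (w *Worker) ServeHTTP(rw http.ResponseWriter, r *http.Request) {
	fmt.Fprint(rw, "ok")
	// Copy what the job needs now: the request body and r.Context() are
	// not usable once the handler has returned.
	path := r.URL.Path
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		w.Job(path)
	}()
}

// Wait blocks until every started job has finished, e.g. during shutdown.
func (w *Worker) Wait() { w.wg.Wait() }

// serveOnce sends one in-memory GET request through h and returns the
// response status and body.
func serveOnce(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	w := &Worker{Job: func(path string) {
		fmt.Println("background job for", path)
	}}
	status, body := serveOnce(w, "/sync")
	fmt.Println(status, body)
	w.Wait()
}
```

The WaitGroup is only there so that a shutdown path (or a test) can wait for outstanding jobs; the client never waits on it.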

MGO and long running Web Services - recovery

I've written a REST web service that uses mongo as the backend data store. I was wondering at this stage (before deployment), what the best practices were, considering a service that essentially runs forever(ish).
Currently, I'm following this type of pattern:
// database.go
...
type DataStore struct {
mongoSession *mgo.Session
}
...
func (d *DataStore) OpenSession () {
... // read setup from environment
mongoSession, err = mgo.Dial(mongoURI)
if err != nil {}
...
}
func (d *DataStore) CloseSession() {...}
func (d *DataStore) Find (...) (results...) {
s := d.mongoSession.Copy()
defer s.Close()
// do stuff, return results
}
In main.go:
func main() {
ds := NewDataStore()
ds.OpenSession()
defer ds.CloseSession()
// Web Service Routes..
...
ws.Handle("/find/{abc}", doFindFunc)
...
}
My question is - what's the recommended practice for recovery from session that has timed out, lost connection (the mongo service provider I'm using is remote, so I assume that this will happen), so on any particular web service call, the database session may no longer work? How do people handle these cases to detect that the session is no longer valid and a "fresh" one should be established?
Thanks!
What you may want is to call .Copy() on the session for each incoming HTTP request (with a deferred .Close()), and copy again from that new session in your handlers if ever needed.
Connections and reconnections are managed by mgo. You can stop and restart MongoDB while making HTTP requests to your web service to see how it is affected.
If there's a DB connection problem while handling an HTTP request, the DB operation will eventually time out (the timeout can be configured by using DialWithTimeout instead of the regular Dial), so you can respond with a 5xx HTTP error code in that case.
