gRPC server stops receiving messages after many messages are sent simultaneously (Go)

I am implementing a simple gRPC service in which the summary of a task is sent to the gRPC server. Everything works when I send a small number of messages, but when I send around 5000 messages the server stops responding and the client gets a deadline exceeded error. I also tried reconnecting, but got this error message:
rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: timed out waiting for server handshake
The server shows no error and is still alive.
I also tried setting GRPC_GO_REQUIRE_HANDSHAKE=off, but the error persists. I also tried sending the summary in batches, but the same scenario repeated.
Is there a limit on the number of messages that can be sent over gRPC?
Here is my service proto:
// The Result service definition.
service Result {
rpc ConntectMaster(ConnectionRequest) returns (stream ExecutionCommand) {}
rpc postSummary(Summary) returns(ExecutionCommand) {}
}
message Summary{
int32 successCount = 1;
int32 failedCount = 2;
int32 startTime = 3;
repeated TaskResult results = 4;
bool isLast = 5;
string id = 6;
}
PostSummary implementation in the server:
// PostSummary posts the summary to the master
func (server *Server) PostSummary(ctx context.Context, in *pb.Summary) (*pb.ExecutionCommand, error) {
for i := 0; i < len(in.Results); i++ {
res := in.Results[i]
log.Printf("%s --> %d Res :: %s, len : %d", in.Id, i, res.Id, len(in.Results))
}
return &pb.ExecutionCommand{Type: stopExec}, nil
}
func postSummaryInBatch(executor *Executor, index int) {
summary := pb.Summary{
SuccessCount: int32(executor.summary.successCount),
FailedCount: int32(executor.summary.failedCount),
Results: []*pb.TaskResult{},
IsLast: false,
}
if index >= len(executor.summary.TaskResults) {
summary.IsLast = true
return
}
var to int
batch := 500
if (index + batch) <= len(executor.summary.TaskResults) {
to = index + batch
} else {
to = len(executor.summary.TaskResults)
}
for i := index; i < to; i++ {
result := executor.summary.TaskResults[i]
taskResult := pb.TaskResult{
Id: result.id,
Msg: result.msg,
Time: result.time,
}
// log.Printf("adding res : %s ", taskResult.Id)
if result.err != nil {
taskResult.IsError = true
}
summary.Results = append(summary.Results, &taskResult)
}
summary.Id = fmt.Sprintf("%d-%d", index, to)
log.Printf("sent from %d to %d ", index, to)
postSummary(executor, &summary, 0)
postSummaryInBatch(executor, to)
}
func postSummary(executor *Executor, summary *pb.Summary, retryCount int) {
ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
defer cancel()
cmd, err := client.PostSummary(ctx, summary)
if err != nil {
if retryCount < 3 {
reconnect(executor)
postSummary(executor, summary, retryCount+1)
}
log.Printf(err.Error())
// log.Fatal("cannot send summary report")
} else {
processServerCommand(executor, cmd)
}
}

gRPC's default maximum receive message size is 4 MB; your gRPC client probably went over that limit.
gRPC uses HTTP/2 as its transport, which opens a single TCP connection and multiplexes requests over it. That significantly reduces overhead compared to HTTP/1.1, so I wouldn't worry too much about batching and would just make individual calls to the gRPC server.
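If batching is kept anyway, one way to stay under a message size limit is to group results by estimated size rather than by a fixed count. The following is only a minimal sketch under stated assumptions: splitBySize, the sizes slice, and the 1000-byte budget are hypothetical names and numbers, not part of the original code. In real code the per-item sizes could come from something like proto.Size, and the budget would sit comfortably below the 4 MB default.

```go
package main

import "fmt"

// splitBySize groups item indexes into batches whose summed sizes stay at or
// below maxBytes. An item larger than maxBytes ends up in a batch of its own.
// (Hypothetical helper for illustration; sizes[i] is the estimated encoded
// size of item i.)
func splitBySize(sizes []int, maxBytes int) [][]int {
	var batches [][]int
	var current []int
	total := 0
	for i, size := range sizes {
		// Close the current batch if adding this item would exceed the budget.
		if len(current) > 0 && total+size > maxBytes {
			batches = append(batches, current)
			current, total = nil, 0
		}
		current = append(current, i)
		total += size
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	sizes := []int{300, 500, 400, 900, 100}
	for _, batch := range splitBySize(sizes, 1000) {
		fmt.Println(batch)
	}
}
```

Each returned batch holds indexes into the original slice, so the caller can build one Summary per batch and send them in a plain loop instead of recursing.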

Related

read tcp read: connection reset by peer

I've been using the Golang DynamoDB SDK for a while now, and recently I started seeing this error type come back:
RequestError: send request failed
caused by: Post "https://dynamodb.[REGION].amazonaws.com/": read tcp [My IP]->[AWS IP]: read: connection reset by peer
This only seems to occur when writing large amounts of data to DynamoDB, although the error is not limited to any particular type of request. I've seen it in both UpdateItem and BatchWriteItem requests. Furthermore, as the failure isn't consistent, I can't localize it to a particular line of code. It seems that the error is related to some sort of network issue between my service and AWS but, as it doesn't come back as a throttling exception, I'm not sure how to debug it. Finally, as the response comes back from a write request, I don't think retry logic is really the solution here either.
Here's my batch-write code:
func (conn *Connection) BatchWrite(tableName string, requests []*dynamodb.WriteRequest) error {
// Get the length of the requests; if there aren't any then return because there's nothing to do
length := len(requests)
log.Printf("Attempting to write %d items to DynamoDB", length)
if length == 0 {
return nil
}
// Get the number of requests to make
numRequests := length / 25
if length%25 != 0 {
numRequests++
}
// Create the variables necessary to manage the concurrency
var wg sync.WaitGroup
errs := make(chan error, numRequests)
// Attempt to batch-write the requests to DynamoDB; because DynamoDB limits the number of concurrent
// items in a batch request to 25, we'll chunk the requests into 25-report segments
sections := make([][]*dynamodb.WriteRequest, numRequests)
for i := 0; i < numRequests; i++ {
// Get the end index which is 25 greater than the current index or the end of the array
// if we're getting close
end := (i + 1) * 25
if end > length {
end = length
}
// Add to the wait group so that we can ensure all the concurrent processes finish
// before we close down the process
wg.Add(1)
// Write the chunk to DynamoDB concurrently
go func(wg *sync.WaitGroup, index int, start int, end int) {
defer wg.Done()
// Call the DynamoDB operation; record any errors that occur
if section, err := conn.batchWriteInner(tableName, requests[start:end]); err != nil {
errs <- err
} else {
sections[index] = section
}
}(&wg, i, i*25, end)
}
// Wait for all the goroutines to finish
wg.Wait()
// Attempt to read an error from the channel; if we get one then return it
// Otherwise, continue. We have to use the select here because this is
// the only way to attempt to read from a channel without it blocking
select {
case err, ok := <-errs:
if ok {
return err
}
default:
break
}
// Now, we've probably gotten retries back so take these and combine them into
// a single list of requests
retries := sections[0]
if len(sections) > 1 {
for _, section := range sections[1:] {
retries = append(retries, section...)
}
}
// Rewrite the requests and return the result
return conn.BatchWrite(tableName, retries)
}
func (conn *Connection) batchWriteInner(tableName string, requests []*dynamodb.WriteRequest) ([]*dynamodb.WriteRequest, error) {
// Create the request
request := dynamodb.BatchWriteItemInput{
ReturnConsumedCapacity: aws.String(dynamodb.ReturnConsumedCapacityNone),
ReturnItemCollectionMetrics: aws.String(dynamodb.ReturnItemCollectionMetricsNone),
RequestItems: map[string][]*dynamodb.WriteRequest{
tableName: requests,
},
}
// Attempt to batch-write the items with an exponential backoff
var result *dynamodb.BatchWriteItemOutput
err := backoff.Retry(func() error {
// Attempt the batch-write; if it fails then back-off and wait. Otherwise break out
// of the loop and return
var err error
if result, err = conn.inner.BatchWriteItem(&request); err != nil {
// If we have an error then what we do here will depend on the error code
// If the error code is for exceeded throughput, exceeded request limit or
// an internal server error then we'll try again. Otherwise, we'll break out
// because the error isn't recoverable
if aerr, ok := err.(awserr.Error); ok {
switch aerr.Code() {
// Go cases do not fall through implicitly, so the retryable codes share one case
case dynamodb.ErrCodeProvisionedThroughputExceededException,
dynamodb.ErrCodeRequestLimitExceeded,
dynamodb.ErrCodeInternalServerError:
return err
}
}
// We received an error that won't be fixed by backing off; return this as a permanent
// error so we can tell the backoff library that we want to break out of the exponential backoff
return backoff.Permanent(err)
}
return nil
}, backoff.NewExponentialBackOff())
// If the batch-write failed then return an error
if err != nil {
return nil, err
}
// Roll the unprocessed items into a single list and return them
var list []*dynamodb.WriteRequest
for _, item := range result.UnprocessedItems {
list = append(list, item...)
}
return list, nil
}
Has anyone else dealt with this issue before? What's the correct approach here?
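One Go detail is worth checking in the retry classification inside batchWriteInner: switch cases in Go do not fall through implicitly, so several error codes only share a handler when they are listed in a single comma-separated case. The sketch below illustrates the difference. The constants are hypothetical stand-ins for the SDK's error codes and the function names are made up for this example; it does not address the connection reset itself.

```go
package main

import "fmt"

// Hypothetical stand-ins for the SDK error code constants.
const (
	codeThroughput = "ProvisionedThroughputExceededException"
	codeLimit      = "RequestLimitExceeded"
	codeInternal   = "InternalServerError"
)

// retryableStacked uses a C-style stack of cases. In Go an empty case body
// does nothing and leaves the switch, so only the last code returns true.
func retryableStacked(code string) bool {
	switch code {
	case codeThroughput:
	case codeLimit:
	case codeInternal:
		return true
	}
	return false
}

// retryableList puts all codes in one case, so each of them returns true.
func retryableList(code string) bool {
	switch code {
	case codeThroughput, codeLimit, codeInternal:
		return true
	}
	return false
}

func main() {
	codes := []string{codeThroughput, codeLimit, codeInternal, "ValidationException"}
	for _, code := range codes {
		fmt.Printf("%-40s stacked=%-5v list=%v\n", code, retryableStacked(code), retryableList(code))
	}
}
```

With the stacked form, throughput and request-limit errors would be treated as permanent and never retried with backoff.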

Why is data being pushed into the channel but never read from the receiver goroutine?

I am building a daemon with two services that send data to and from each other. Service A produces the data, and Service B is a data buffer service, similar to a queue. In main.go, Service B is instantiated and started. Its Start() method runs the buffer() function as a goroutine, because that function waits for data to arrive on a channel and I don't want the main process to block waiting for it to complete. Then Service A is instantiated and started, and it is also "registered" with Service B.
I created a method called RegisterWithBufferService for Service A that creates two new channels. It stores those channels as its own attributes and also provides them to Service B.
func (s *ServiceA) RegisterWithBufferService(bufService *data.DataBuffer) error {
newIncomingChan := make(chan *data.DataFrame, 1)
newOutgoingChan := make(chan []byte, 1)
s.IncomingBuffChan = newIncomingChan
s.OutgoingDataChannels = append(s.OutgoingDataChannels, newOutgoingChan)
bufService.DataProviders[s.ServiceName()] = data.DataProviderInfo{
IncomingChan: newOutgoingChan, //our outGoing channel is their incoming
OutgoingChan: newIncomingChan, // our incoming channel is their outgoing
}
s.DataBufferService = bufService
bufService.NewProvider <- s.ServiceName() //The DataBuffer service listens for new services and creates a new goroutine for buffering
s.Logger.Info().Msg("Registeration completed.")
return nil
}
Buffer essentially listens for incoming data from Service A, decodes it using Decode() and then adds it to a slice called buf. If the slice is greater in length than bufferPeriod then it will send the first item in the slice in the Outgoing channel back to Service A.
func (b* DataBuffer) buffer(bufferPeriod int) {
for {
select {
case newProvider := <- b.NewProvider:
b.wg.Add(1)
/*
newProvider is a string
DataProviders is a map the value it returns is a struct containing the Incoming and
Outgoing channels for this service
*/
p := b.DataProviders[newProvider]
go func(prov string, in chan []byte, out chan *DataFrame) {
defer b.wg.Done()
var buf []*DataFrame
for {
select {
case rawData := <-in:
tmp := Decode(rawData) //custom decoding function. Returns a *DataFrame
buf = append(buf, tmp)
if len(buf) < bufferPeriod {
b.Logger.Info().Msg("Sending decoded data out.")
out <- buf[0]
buf = buf[1:] //pop
}
case <- b.Quit:
return
}
}
}(newProvider, p.IncomingChan, p.OutgoingChan)
}
case <- b.Quit:
return
}
}
Now Service A has a method called record that periodically pushes data to all the channels in its OutgoingDataChannels attribute.
func (s *ServiceA) record() error {
...
if atomic.LoadInt32(&s.Listeners) != 0 {
s.Logger.Info().Msg("Sending raw data to data buffer")
for _, outChan := range s.OutgoingDataChannels {
outChan <- dataBytes // the receiver (Service B) is already listening and this doesn't hang
}
s.Logger.Info().Msg("Raw data sent and received") // The logger will output this so I know it's not hanging
}
}
The problem is that Service A seems to push the data successfully using record, but Service B never enters the case rawData := <-in: branch in the buffer sub-goroutine. Is this because I have nested goroutines? In case it's not clear: when Service B is started, it calls buffer, and because that call would otherwise block, I made it a goroutine. Then, when Service A calls RegisterWithBufferService, the buffer goroutine creates a goroutine to listen for new data from Service B and push it back to Service A once the buffer is filled. I hope I explained it clearly.
EDIT 1
I've made a minimal, reproducible example.
package main
import (
"fmt"
"sync"
"sync/atomic"
"time"
)
var (
defaultBufferingPeriod int = 3
DefaultPollingInterval int64 = 10
)
type DataObject struct{
Data string
}
type DataProvider interface {
RegisterWithBufferService(*DataBuffer) error
ServiceName() string
}
type DataProviderInfo struct{
IncomingChan chan *DataObject
OutgoingChan chan *DataObject
}
type DataBuffer struct{
Running int32 //used atomically
DataProviders map[string]DataProviderInfo
Quit chan struct{}
NewProvider chan string
wg sync.WaitGroup
}
func NewDataBuffer() *DataBuffer{
var (
wg sync.WaitGroup
)
return &DataBuffer{
DataProviders: make(map[string]DataProviderInfo),
Quit: make(chan struct{}),
NewProvider: make(chan string),
wg: wg,
}
}
func (b *DataBuffer) Start() error {
if ok := atomic.CompareAndSwapInt32(&b.Running, 0, 1); !ok {
return fmt.Errorf("Could not start Data Buffer Service.")
}
go b.buffer(defaultBufferingPeriod)
return nil
}
func (b *DataBuffer) Stop() error {
if ok := atomic.CompareAndSwapInt32(&b.Running, 1, 0); !ok {
return fmt.Errorf("Could not stop Data Buffer Service.")
}
for _, p := range b.DataProviders {
close(p.IncomingChan)
close(p.OutgoingChan)
}
close(b.Quit)
b.wg.Wait()
return nil
}
// buffer creates goroutines for each incoming, outgoing data pair and decodes the incoming bytes into outgoing DataFrames
func (b *DataBuffer) buffer(bufferPeriod int) {
for {
select {
case newProvider := <- b.NewProvider:
fmt.Println("Received new Data provider.")
if _, ok := b.DataProviders[newProvider]; ok {
b.wg.Add(1)
p := b.DataProviders[newProvider]
go func(prov string, in chan *DataObject, out chan *DataObject) {
defer b.wg.Done()
var (
buf []*DataObject
)
fmt.Printf("Waiting for data from: %s\n", prov)
for {
select {
case inData := <-in:
fmt.Printf("Received data from: %s\n", prov)
buf = append(buf, inData)
if len(buf) > bufferPeriod {
fmt.Printf("Queue is filled, sending data back to %s\n", prov)
out <- buf[0]
fmt.Println("Data Sent")
buf = buf[1:] //pop
}
case <- b.Quit:
return
}
}
}(newProvider, p.IncomingChan, p.OutgoingChan)
}
case <- b.Quit:
return
}
}
}
type ServiceA struct{
Active int32 // atomic
Stopping int32 // atomic
Recording int32 // atomic
Listeners int32 // atomic
name string
QuitChan chan struct{}
IncomingBuffChan chan *DataObject
OutgoingBuffChans []chan *DataObject
DataBufferService *DataBuffer
}
// A compile time check to ensure ServiceA fully implements the DataProvider interface
var _ DataProvider = (*ServiceA)(nil)
func NewServiceA() (*ServiceA, error) {
var newSliceOutChans []chan *DataObject
return &ServiceA{
QuitChan: make(chan struct{}),
OutgoingBuffChans: newSliceOutChans,
name: "SERVICEA",
}, nil
}
// Start starts the service. Returns an error if any issues occur
func (s *ServiceA) Start() error {
atomic.StoreInt32(&s.Active, 1)
return nil
}
// Stop stops the service. Returns an error if any issues occur
func (s *ServiceA) Stop() error {
atomic.StoreInt32(&s.Stopping, 1)
close(s.QuitChan)
return nil
}
func (s *ServiceA) StartRecording(pol_int int64) error {
if ok := atomic.CompareAndSwapInt32(&s.Recording, 0, 1); !ok {
return fmt.Errorf("Could not start recording. Data recording already started")
}
ticker := time.NewTicker(time.Duration(pol_int) * time.Second)
go func() {
for {
select {
case <-ticker.C:
fmt.Println("Time to record...")
err := s.record()
if err != nil {
return
}
case <-s.QuitChan:
ticker.Stop()
return
}
}
}()
return nil
}
func (s *ServiceA) record() error {
current_time := time.Now()
ct := fmt.Sprintf("%02d-%02d-%d", current_time.Day(), current_time.Month(), current_time.Year())
dataObject := &DataObject{
Data: ct,
}
if atomic.LoadInt32(&s.Listeners) != 0 {
fmt.Println("Sending data to Data buffer...")
for _, outChan := range s.OutgoingBuffChans {
outChan <- dataObject // the receivers should already be listening
}
fmt.Println("Data sent.")
}
return nil
}
// RegisterWithBufferService satisfies the DataProvider interface. It provides the bufService with new incoming and outgoing channels along with a polling interval
func (s ServiceA) RegisterWithBufferService(bufService *DataBuffer) error {
if _, ok := bufService.DataProviders[s.ServiceName()]; ok {
return fmt.Errorf("%v data provider already registered with Data Buffer.", s.ServiceName())
}
newIncomingChan := make(chan *DataObject, 1)
newOutgoingChan := make(chan *DataObject, 1)
s.IncomingBuffChan = newIncomingChan
s.OutgoingBuffChans = append(s.OutgoingBuffChans, newOutgoingChan)
bufService.DataProviders[s.ServiceName()] = DataProviderInfo{
IncomingChan: newOutgoingChan, //our outGoing channel is their incoming
OutgoingChan: newIncomingChan, // our incoming channel is their outgoing
}
s.DataBufferService = bufService
bufService.NewProvider <- s.ServiceName() //The DataBuffer service listens for new services and creates a new goroutine for buffering
return nil
}
// ServiceName satisfies the DataProvider interface. It returns the name of the service.
func (s ServiceA) ServiceName() string {
return s.name
}
func main() {
var BufferedServices []DataProvider
fmt.Println("Instantiating and Starting Data Buffer Service...")
bufService := NewDataBuffer()
err := bufService.Start()
if err != nil {
panic(fmt.Sprintf("%v", err))
}
defer bufService.Stop()
fmt.Println("Data Buffer Service successfully started.")
fmt.Println("Instantiating and Starting Service A...")
serviceA, err := NewServiceA()
if err != nil {
panic(fmt.Sprintf("%v", err))
}
BufferedServices = append(BufferedServices, *serviceA)
err = serviceA.Start()
if err != nil {
panic(fmt.Sprintf("%v", err))
}
defer serviceA.Stop()
fmt.Println("Service A successfully started.")
fmt.Println("Registering services with Data Buffer...")
for _, s := range BufferedServices {
_ = s.RegisterWithBufferService(bufService) // ignoring error msgs for base case
}
fmt.Println("Registration complete.")
fmt.Println("Beginning recording...")
_ = atomic.AddInt32(&serviceA.Listeners, 1)
err = serviceA.StartRecording(DefaultPollingInterval)
if err != nil {
panic(fmt.Sprintf("%v", err))
}
for {
select {
case RTD := <-serviceA.IncomingBuffChan:
fmt.Println(RTD)
case <-serviceA.QuitChan:
atomic.StoreInt32(&serviceA.Listeners, 0)
bufService.Quit<-struct{}{}
}
}
}
Running on Go 1.17. When running the example, it should print the following every 10 seconds:
Time to record...
Sending data to Data buffer...
Data sent.
But then Data buffer never goes into the inData := <-in case.
To diagnose this I changed fmt.Println("Sending data to Data buffer...") to fmt.Println("Sending data to Data buffer...", s.OutgoingBuffChans) and the output was:
Time to record...
Sending data to Data buffer... []
So you are not actually sending the data to any channels. The reason for this is:
func (s ServiceA) RegisterWithBufferService(bufService *DataBuffer) error {
Because the receiver is not a pointer, when you do s.OutgoingBuffChans = append(s.OutgoingBuffChans, newOutgoingChan) you are changing s.OutgoingBuffChans in a copy of the ServiceA, and that copy is discarded when the function exits. To fix this, change:
func (s ServiceA) RegisterWithBufferService(bufService *DataBuffer) error {
to
func (s *ServiceA) RegisterWithBufferService(bufService *DataBuffer) error {
and
BufferedServices = append(BufferedServices, *serviceA)
to
BufferedServices = append(BufferedServices, serviceA)
The amended version outputs:
Time to record...
Sending data to Data buffer... [0xc0000d8060]
Data sent.
Received data from: SERVICEA
Time to record...
Sending data to Data buffer... [0xc0000d8060]
Data sent.
Received data from: SERVICEA
So this resolves the reported issue (I would not be surprised if there are other issues, but hopefully this points you in the right direction). I did notice that the code you originally posted does use a pointer receiver, so that might have suffered from another issue (but it's difficult to comment on code fragments in a case like this).
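The effect of the receiver type can be reproduced in isolation. This is a minimal sketch with a hypothetical service type, not the code from the question: appending to a slice field through a value receiver changes only the copy, while a pointer receiver changes the caller's struct.

```go
package main

import "fmt"

// service is a hypothetical, stripped-down stand-in for ServiceA.
type service struct {
	chans []chan int
}

// registerByValue operates on a copy of the struct, so the appended
// channel is lost when the method returns.
func (s service) registerByValue() {
	s.chans = append(s.chans, make(chan int, 1))
}

// registerByPointer operates on the caller's struct, so the appended
// channel is still there after the method returns.
func (s *service) registerByPointer() {
	s.chans = append(s.chans, make(chan int, 1))
}

func main() {
	var s service

	s.registerByValue()
	fmt.Println(len(s.chans)) // prints 0

	s.registerByPointer()
	fmt.Println(len(s.chans)) // prints 1
}
```

The same reasoning applies to storing *serviceA in a slice of interfaces: storing the dereferenced value makes the interface hold a copy, so later changes to the original are not visible through it.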

How to implement a function using channels

There is a function that runs in a goroutine:
func (c *controlUC) WebhookPool() {
for {
if len(c.webhookPool) == 0 {
continue
}
for i := 0; i < len(c.webhookPool); i++ {
if !c.webhookPool[i].LastSentTime.IsZero() && time.Now().Before(c.webhookPool[i].LastSentTime.Add(GetDelayBySentCount(c.webhookPool[i].SendCount))) {
continue
}
var headers = make(map[string]string)
headers["Content-type"] = "application/json"
_, statusCode, err := c.fhttpClient.Request("POST", c.webhookPool[i].Path, c.webhookPool[i].Body, nil, headers)
if err != nil {
c.logger.Error(err)
return
}
if statusCode != 200 {
if c.webhookPool[i].SendCount >= 2 {
c.webhookPool = append(c.webhookPool[:i], c.webhookPool[i+1:]...)
i--
continue
}
c.webhookPool[i].SendCount++
} else {
c.webhookPool = append(c.webhookPool[:i], c.webhookPool[i+1:]...)
i--
continue
}
c.webhookPool[i].LastSentTime = time.Now()
}
}
}
// webhookPool []models.WebhookPoolElem
type WebhookPoolElem struct {
SendCount int
LastSentTime time.Time
Path string
Body []byte
}
A WebhookPoolElem is added to c.webhookPool, after which a request is sent to the server (the path is taken from WebhookPoolElem.Path). If the server returns a non-200 status, I need to send the request again after X seconds (taken from GetDelayBySentCount(), which returns a different delay depending on SendCount). The number of attempts is limited (c.webhookPool[i].SendCount >= 2).
But maybe this function should be implemented with channels? If so, how?
Let's say the controlUC receiver has a field webhookPool chan WebhookPoolElem, initialized as webhookPool: make(chan WebhookPoolElem, n) with n as the buffer size.
You can then receive elements from the channel and more or less replace c.webhookPool[i] with elem. Rewrite it like this:
func (c *controlUC) WebhookPool() {
for {
elem, open := <-c.webhookPool
if !open {
return
}
if !elem.LastSentTime.IsZero() && time.Now().Before(elem.LastSentTime.Add(GetDelayBySentCount(elem.SendCount))) {
c.webhookPool <- elem // not due yet: put it back so the element is not lost
continue
}
// I omit http request
if statusCode != 200 {
if elem.SendCount >= 2 {
// drop message from channel, no need to do anything
continue
}
elem.SendCount++
elem.LastSentTime = time.Now()
c.webhookPool <- elem // enqueue again
}
}
}
I suggest a buffered channel so that the last send, c.webhookPool <- elem, does not block. It is best to place the send in a select with a default case, so that if the send cannot proceed despite the buffer, the goroutine doesn't block:
select {
case c.webhookPool <- elem:
// success
default:
// can not send
}
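The non-blocking send can be tried out on its own. This is a small sketch, assuming a hypothetical elem type and trySend helper that are not part of the original code: with a buffer of two, the third send takes the default branch instead of blocking.

```go
package main

import "fmt"

// elem is a hypothetical, reduced version of WebhookPoolElem.
type elem struct {
	Path      string
	SendCount int
}

// trySend attempts a non-blocking send and reports whether the element
// was accepted by the channel.
func trySend(ch chan elem, e elem) bool {
	select {
	case ch <- e:
		return true
	default:
		// The buffer is full and no receiver is ready.
		return false
	}
}

func main() {
	queue := make(chan elem, 2)
	for i := 1; i <= 3; i++ {
		ok := trySend(queue, elem{Path: fmt.Sprintf("/hook/%d", i)})
		fmt.Println(i, ok)
	}
}
```

What to do in the default branch is a design choice: the element could be logged and dropped, or kept in a local slice and retried later.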

client streaming protocol violation while creating grpc server stream endpoint in go

I am trying to create a grpc server streaming endpoint.
Here is my protobuf file
syntax = "proto3";
option go_package = "mirror_streampb";
option java_package = "com.mirror_stream";
option java_outer_classname = "StreamIdsProto";
option java_multiple_files = true;
package mirror_stream;
service StreamIDs {
rpc ListIDs(ListIDsRequest) returns (stream ListIDsResponse) {}
}
message ListIDsRequest {
int32 num = 1;
}
message ListIDsResponse {
int32 num = 1;
int32 id = 2;
}
And here is my Go implementation of that method, which only returns some random numbers.
func (s *Server) ListIDs(req *streampb.ListIDsRequest, stream streampb.StreamIDs_ListIDsServer) error {
for i := int32(0); i < req.Num; i++ {
resp := &streampb.ListIDsResponse{
Num: i,
Id: int32(rand.Intn(10000000)),
}
if err := stream.Send(resp); err != nil {
return err
}
}
return nil
}
So when I try to call that method, I get this error: Failed while making call: code:unknown message:grpc: client streaming protocol violation: get <nil>, want <EOF>
I am not sure why this error occurs or where it comes from.
Can anyone help me figure this out?

Graceful shutdown of gRPC downstream

Using the following protocol buffer definition:
syntax = "proto3";
package pb;
message SimpleRequest {
int64 number = 1;
}
message SimpleResponse {
int64 doubled = 1;
}
// All the calls in this service perform the action of doubling a number.
// The streams will continuously send the next double, e.g. 1, 2, 4, 8, 16.
service Test {
// This RPC streams from the server only.
rpc Downstream(SimpleRequest) returns (stream SimpleResponse);
}
I'm able to successfully open a stream, and continuously get the next doubled number from the server.
My Go code for running this looks like:
ctxDownstream, cancel := context.WithCancel(ctx)
downstream, err := testClient.Downstream(ctxDownstream, &pb.SimpleRequest{Number: 1})
for {
responseDownstream, err := downstream.Recv()
if err != io.EOF {
println(fmt.Sprintf("downstream response: %d, error: %v", responseDownstream.Doubled, err))
if responseDownstream.Doubled >= 32 {
break
}
}
}
cancel() // !!This is not a graceful shutdown
println(fmt.Sprintf("%v", downstream.Trailer()))
The problem is that cancelling the context leaves my downstream.Trailer() response empty. Is there a way to gracefully close this connection from the client side and still receive downstream.Trailer()?
Note: if I close the downstream connection from the server side, my trailers are populated. But I have no way of instructing my server side to close this particular stream. So there must be a way to gracefully close a stream from the client side.
Thanks.
As requested some server code :
func (b *binding) Downstream(req *pb.SimpleRequest, stream pb.Test_DownstreamServer) error {
request := req
r := make(chan *pb.SimpleResponse)
e := make(chan error)
ticker := time.NewTicker(200 * time.Millisecond)
defer func() { ticker.Stop(); close(r); close(e) }()
go func() {
defer func() { recover() }()
for {
select {
case <-ticker.C:
response, err := b.Endpoint(stream.Context(), request)
if err != nil {
e <- err
}
r <- response
}
}
}()
for {
select {
case err := <-e:
return err
case response := <-r:
if err := stream.Send(response); err != nil {
return err
}
request.Number = response.Doubled
case <-stream.Context().Done():
return nil
}
}
}
You will still need to populate the trailer with some information. I use the grpc.StreamServerInterceptor to do this.
According to the grpc go documentation
Trailer returns the trailer metadata from the server, if there is any.
It must only be called after stream.CloseAndRecv has returned, or
stream.Recv has returned a non-nil error (including io.EOF).
So if you want to read the trailer in the client, try something like this:
ctxDownstream, cancel := context.WithCancel(ctx)
defer cancel()
for {
...
// on error or EOF
break;
}
println(fmt.Sprintf("%v", downstream.Trailer()))
Break out of the infinite loop when there is an error and then print the trailer. cancel will be called at the end of the function because it is deferred.
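The shape of that receive loop can be sketched without a gRPC dependency. Everything here is an assumption made for illustration: recvStream and fakeStream are hypothetical stand-ins for the generated client stream, and the "status" trailer key is invented. The point is only the ordering: keep calling Recv until it returns an error (io.EOF included), and read the trailer afterwards.

```go
package main

import (
	"fmt"
	"io"
)

// recvStream is a hypothetical stand-in for the generated client stream;
// it models only the two methods used in the loop.
type recvStream interface {
	Recv() (int64, error)
	Trailer() map[string][]string
}

// fakeStream yields a fixed list of values and then io.EOF. Its trailer is
// only available once Recv has returned an error.
type fakeStream struct {
	values []int64
	done   bool
}

func (f *fakeStream) Recv() (int64, error) {
	if len(f.values) == 0 {
		f.done = true
		return 0, io.EOF
	}
	v := f.values[0]
	f.values = f.values[1:]
	return v, nil
}

func (f *fakeStream) Trailer() map[string][]string {
	if !f.done {
		return nil
	}
	return map[string][]string{"status": {"finished"}}
}

// drain reads until Recv fails, then returns the received values and the
// trailer. io.EOF marks a normal end of stream and is not reported as an error.
func drain(s recvStream) ([]int64, map[string][]string, error) {
	var got []int64
	for {
		v, err := s.Recv()
		if err == io.EOF {
			return got, s.Trailer(), nil
		}
		if err != nil {
			return got, s.Trailer(), err
		}
		got = append(got, v)
	}
}

func main() {
	values, trailer, err := drain(&fakeStream{values: []int64{1, 2, 4, 8}})
	fmt.Println(values, trailer, err)
}
```

Checking the error before touching the response also avoids dereferencing a nil message when Recv fails.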
I can't find a reference that explains it clearly, but this doesn't appear to be possible.
On the wire, grpc-status is followed by the trailer metadata when the call completes normally (i.e. the server exits the call).
When the client cancels the call, neither of these is sent.
It seems that gRPC treats call cancellation as a quick abort of the RPC, not much different from the socket being dropped.
Adding a "cancel message" via request streaming works; the server can pick this up and cancel the stream from its end and trailers will still get sent:
message SimpleRequest {
oneof RequestType {
int64 number = 1;
bool cancel = 2;
}
}
....
rpc Downstream(stream SimpleRequest) returns (stream SimpleResponse);
Although this does add a bit of complication to the code.
