websocket gracefully shutdown - go

I have a websocket server. I wrote a test for him that tests his ability to shutdown gracefully. 5 connections are created and each sends 5 requests. After a while, shutdown starts. All 25 requests must be fulfilled. If I close the exit channel, then the test does not work as it should.
time.AfterFunc(50*time.Millisecond, func() {
close(exit)
close(done)
})
And if I just call the s.shutdown function, then everything is ok.
time.AfterFunc(50*time.Millisecond, func() {
require.Nil(t, s.Shutdown())
close(done)
})
My test
func TestServer_GracefulShutdown(t *testing.T) {
done := make(chan struct{})
exit := make(chan struct{})
ctx := context.Background()
finishedRequestCount := atomic.NewInt32(0)
ln, err := net.Listen("tcp", "localhost:")
require.Nil(t, err)
handler := HandlerFunc(func(conn *websocket.Conn) {
for {
_, _, err := conn.ReadMessage()
if err != nil {
return
}
time.Sleep(100 * time.Millisecond)
finishedRequestCount.Inc()
}
})
s, err := makeServer(ctx, handler) // server create
require.Nil(t, err)
time.AfterFunc(50*time.Millisecond, func() {
close(exit)
close(done)
})
go func() {
fmt.Printf("Running...")
require.Nil(t, s.Run(ctx, exit, ln))
}()
for i := 0; i < connCount; i++ {
go func() {
err := clientRun(ln)
require.Nil(t, err)
}()
}
<-done
assert.Equal(t, int32(totalCount), finishedRequestCount.Load())
}
My run func
func (s *Server) Run(ctx context.Context, exit <-chan struct{}, ln net.Listener) error {
errs := make(chan error, 1)
go func() {
err := s.httpServer.Run(ctx, exit, ln)
if err != nil {
errs <- err
}
}()
select {
case <-ctx.Done():
return s.Close()
case <-exit:
return s.Shutdown()
case err := <-errs:
return err
}
}
My shutdown
func (s *Server) Shutdown() error {
err := s.httpServer.Shutdown() // we close the possibility to connect to any conn
s.wg.Wait()
return err
}

What is happening if the following code is executed?
close(exit)
close(done)
Both channels are closed almost at the same time. The first triggers the Shutdown function which waits for a graceful shutdown. But the second triggers the evaluation of
assert.Equal(t, int32(totalCount), finishedRequestCount.Load())
It is triggered while the graceful shutdown is still running or hasn't even started yet.
If you execute the Shutdown function directly it will block until finished and only then close(done) will start the assertion. That is why this works:
require.Nil(t, s.Shutdown())
close(done)
You can move the close(done) to the following location to make the test work while using the exit channel to close:
go func() {
fmt.Printf("Running...")
require.Nil(t, s.Run(ctx, exit, ln))
close(done)
}()
This way done will be closed after the Shutdown function was executed.
As discussed in the comments I strongly suggest to use contexts instead of channels to close. They have the complexity of close channels hidden away.

Related

go routines and channel to send response

I have the following code:
i have a list to go through and do something with a value from that list, and so i thought of using go routines, but i need to use a max number of go routines, and then in go routine i need to make a call that will get a return of response, err, when the err is different from null I need to terminate all the go routines and return an http response, and if there is no err I need to terminate the go routines and return an http response,
When I have few values ​​it works ok, but when I have many values ​​I have a problem, because when I call cancel I will still have go routines trying to send to the response channel that is already closed and I keep getting errors from:
goroutine 36 [chan send]:
type response struct {
value string
}
func Testing() []response {
fakeValues := getFakeValues()
maxParallel := 25
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if len(fakeValues) < maxParallel {
maxParallel = len(fakeValues)
}
type responseChannel struct {
Response response
Err error
}
reqChan := make(chan string) //make this an unbuffered channel
resChan := make(chan responseChannel)
wg := &sync.WaitGroup{}
wg.Add(maxParallel)
for i := 0; i < maxParallel; i++ {
go func(ctx context.Context, ch chan string, resChan chan responseChannel) {
for {
select {
case val := <-ch:
resp, err := getFakeResult(val)
resChan <- responseChannel{
Response: resp,
Err: err,
}
case <-ctx.Done():
wg.Done()
return
}
}
}(ctx, reqChan, resChan)
}
go func() {
for _, body := range fakeValues {
reqChan <- body
}
close(reqChan)
cancel()
}()
go func() {
wg.Wait()
close(resChan)
}()
var hasErr error
response := make([]response, 0, len(fakeValues))
for res := range resChan {
if res.Err != nil {
hasErr = res.Err
cancel()
break
}
response = append(response, res.Response)
}
if hasErr != nil {
// return responses.ErrorResponse(hasErr) // returns http response
}
// return responses.Accepted(response, nil) // returns http response
return nil
}
func getFakeValues() []string {
return []string{"a"}
}
func getFakeResult(val string) (response, error) {
if val == "" {
return response{}, fmt.Errorf("ooh noh:%s", val)
}
return response{
value: val,
}, nil
}
The workers end up blocked on sending to resChan because it's not buffered, and after an error, nothing reads from it.
You can either make resChan buffered, with a size at least as large as maxParallel. Or check to see if the context was canceled, e.g. change the resChan <- to
select {
case resChan <- responseChannel{
Response: resp,
Err: err,
}:
case <-ctx.Done():
}
There are two main problems with your solution:
First, if your fakeValues slice has more items than maxParallel+1, your program will block on this part:
for _, body := range fakeValues {
reqChan <- body
}
How does this happen? As you start putting values in reqChan, each started goroutine will read one value from the reqChan and try to write the response to resChan. But, since resChan is still not reading responses, each goroutine will block there (writing to resChan). Eventually, once each goroutine is blocked, reading from the reqChan is blocked as well and you cannot put any more values in it (apart from one buffered value).
Second, you are passing the context to your goroutines, but you are not doing anything with it. You can use ctx.Done() channel to get a signal to exit the goroutine. Something like this:
go func(ctx context.Context, ch chan string, resChan chan responseChannel) {
for {
select {
case val := <-ch:
resp, err := getFakeResult(val)
resChan <- responseChannel{
Response: resp,
Err: err,
}
case <- ctx.Done():
return
}
}
}(ctx, reqChan, resChan)
Now, to tie everything together so that there are no deadlocks, no race conditions, and no situations where values are not processed, a few other changes need to be made. I've posted the entire code below.
func Testing() []response {
fakeValues := getFakeValues()
maxParallel := 25
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if len(fakeValues) < maxParallel {
maxParallel = len(fakeValues)
}
type responseChannel struct {
Response response
Err error
}
reqChan := make(chan string) //make this an unbuffered channel
resChan := make(chan responseChannel)
wg := &sync.WaitGroup{}
wg.Add(maxParallel)
for i := 0; i < maxParallel; i++ {
go func(ctx context.Context, ch chan string, resChan chan responseChannel) {
for {
select {
case val := <-ch:
resp, err := getFakeResult(val)
resChan <- responseChannel{
Response: resp,
Err: err,
}
case <- ctx.Done():
wg.Done()
return
}
}
wg.Done()
}(ctx, reqChan, resChan)
}
go func() {
for _, body := range fakeValues {
reqChan <- body
}
close(reqChan)
//putting cancel here so that it can terminate all goroutines when all values are read from reqChan
cancel()
}()
go func() {
wg.Wait()
close(resChan)
}()
var hasErr error
response := make([]response, 0, len(fakeValues))
for res := range resChan {
if res.Err != nil {
hasErr = res.Err
cancel()
break
}
response = append(response, res.Response)
}
if hasErr != nil {
return responses.ErrorResponse(hasErr) // returns http response
}
return responses.Accepted(response, nil) // returns http response
}
In short, the changes are:
reqChan is an unbuffered channel, as this will help in cases where values might not get processed when we close goroutines that read data from buffered channels.
worker goroutines have been changed to accommodate the cases of both exiting when error happens and when there is no more data from reqChan to process. wg.Done() is executed when the context is canceled to ensure that resChan is eventually closed.
separate goroutine is created to put the data in the reqChan without blocking the program, close it afterward, and cancel the context.

go routine, function executed out of desired order

Hi I'm learning goroutine, and I have a question I have a code where I have a publisher and a listener, I need the subscribe call to happen first then the publish call.
useCase := New(tt.fields.storage)
tt.fields.wg.Add(1)
go func() {
ch, _, err := useCase.Subscribe(tt.args.ctx, tt.args.message.TopicName)
message, ok := <- ch
if !ok {
close(ch)
tt.fields.wg.Done()
}
require.NoError(t, err)
assert.Equal(t, tt.want, message)
}()
err := useCase.Publish(tt.args.ctx, tt.args.message)
tt.fields.wg.Wait()
and as I send messages inside these two functions I need Go Func, how would I get subscribe to always execute first, and then publish.
You can use a channel:
subscribed:=make(chan struct{})
go func() {
ch, _, err := useCase.Subscribe(tt.args.ctx, tt.args.message.TopicName)
if err!=nil {
...
}
close(subscribed)
message, ok := <- ch
...
}()
<-subscribed
err := useCase.Publish(tt.args.ctx, tt.args.message)

Go routines with kafka consumer channel and context

I have a simple kafka consumer for which I have created a handle and trying to read it using a go routine:
func process(ctx context.Context){
consumer := queueHandle.Consume(topic_ops_req, consumerHandler)
// Get signal for finish
doneCh := make(chan struct{})
go func(consumer chan *sarama.ConsumerMessage, ctx context.Context) {
for {
select {
case msg, ok := <-consumer:
if !ok {
logger.Info("Channel has been closed")
doneCh <- struct{}{}
return
}
var request queue.Request
err := json.Unmarshal(msg.Value, &request)
if err != nil {
logger.Error("consumer unmarshal err", err)
panic(err)
}
res, err := new_process(ctx, request, service) // call another func
if err != nil {
//TODO
}
result = res
doneCh <- struct{}{}
case <-ctx.Done():
logger.Info(fmt.Sprintf("Context ended with err : %s", ctx.Err()))
doneCh <- struct{}{}
}
}
}(consumer, ctx)
<-doneCh
}
The issue I am seeing is that once I introduce the "case <-ctx.Done()", the go routine does not enter the "case msg, ok := <-consumer" and always returns that the context ended. How do I my go func work with both consumer channel and ctx.Done() ?

Handling multiple websocket connections

I'm trying to create a program that will connect to several servers though gorilla web-sockets. I currently have a program that will iterate over a list of server addresses and create a new goroutine that will create its own Websocket.conn and handle reading and writing.
The problem is that every time a new goroutine is created the previous goroutines are blocked and only the last one can continue. I believe this is because the gorilla websocket library is blocking each gorotutine, but I might be mistaken.
I have tried putting a timer in the server list iterator and each goroutine will work perfectly but then the moment a new goroutine is made with another address the previous goroutine is blocked.
The relevant bits of my code:
In my main.go
for _, server := range servers {
go control(ctx, server, port)
}
In control()
func control(ctx context.Context, server, port string) {
url := url.URL{
Scheme: "ws",
Host: server + ":" + port,
Path: "",
}
conn, _, err := websocket.DefaultDialer.Dial(url.String(), nil)
if err != nil {
panic(err)
}
defer conn.Close()
go sendHandler(ctx, conn)
go readHandler(ctx, conn)
}
readHandler(ctx context.Context, conn *websocket.Con) {
for {
_, p, err := conn.ReadMessage(); if err != nil {
panic(err)
}
select {
case <-ctx.Done():
goto TERM
default:
// do nothing
}
}
TERM:
// do termination
}
sendHandler(ctx context.Context, conn *websocket.Con) {
for _, msg := range msges {
err = conn.WriteMessage(websocket.TextMessage, msg)
if err != nil {
panic(err)
}
}
<-ctx.Done()
}
I removed the parts where I add waitgroups and other unnecessary pieces of code.
So what I expect is for there to be 3n goroutines running (where n is the number of servers) without blocking but right now I see only 3 goroutines running which are the ones called by the last iteration of the server list.
Thanks!
EDIT 14/06/2019:
I spent some time making a small working example and in the example the bug did not occur - none of the threads blocked each other. I'm still unsure what was causing it but here is my small working example:
main.go
package main
import (
"context"
"fmt"
"os"
"time"
"os/signal"
"syscall"
"sync"
"net/url"
"github.com/gorilla/websocket"
)
func main() {
servers := []string{"5555","5556", "5557"}
comms := make(chan os.Signal, 1)
signal.Notify(comms, os.Interrupt, syscall.SIGTERM)
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
var wg sync.WaitGroup
for _, server := range servers {
wg.Add(1)
go control(server,
ctx,
&wg)
}
<-comms
cancel()
wg.Wait()
}
func control(server string, ctx context.Context, wg *sync.WaitGroup) {
fmt.Printf("Started control for %s\n", server)
url := url.URL {
Scheme: "ws",
Host: "0.0.0.0" + ":" + server,
Path: "",
}
conn, _, err := websocket.DefaultDialer.Dial(url.String(), nil)
if err != nil {
panic(err)
}
defer conn.Close()
var localwg sync.WaitGroup
localwg.Add(1)
go sendHandler(ctx, conn, &localwg, server)
localwg.Add(1)
go readHandler(ctx, conn, &localwg, server)
<- ctx.Done()
localwg.Wait()
wg.Done()
return
}
func sendHandler(ctx context.Context, conn *websocket.Conn, wg *sync.WaitGroup, server string) {
for i := 0; i < 50; i++ {
err := conn.WriteMessage(websocket.TextMessage, []byte("ping"))
if err != nil {
panic(err)
}
fmt.Printf("sent msg to %s\n", server)
time.Sleep(1 * time.Second)
}
<- ctx.Done()
wg.Done()
}
func readHandler(ctx context.Context, conn *websocket.Conn, wg *sync.WaitGroup, server string) {
for {
select {
case <- ctx.Done():
wg.Done()
return
default:
_, p, err := conn.ReadMessage()
if err != nil {
wg.Done()
fmt.Println("done")
}
fmt.Printf("Got [%s] from %s\n", string(p), server)
}
}
}
I tested it with dpallot's simple-websocket-server by a server on 5555, 5556 and 5557 respectively.
This part of your code is causing the problem:
conn, _, err := websocket.DefaultDialer.Dial(url.String(), nil)
if err != nil {
panic(err)
}
defer conn.Close()
go sendHandler(ctx, conn)
go readHandler(ctx, conn)
You create the connection, defer the close of it, start two other goroutines and then end the function. The function end closes the socket due to your defer.

Go routine leak fix

I am working on a small service at the moment. From my testing, the code I've written has the possibility of leaking go routines under certain circumstances pertaining to the context. Is there a good and/or idiomatic way to remedy this? I'm providing some sample code below.
func Handle(ctx context.Context, r *Req) (*Response, error) {
ctx, cancel := context.WithTimeout(ctx, time.Second * 5)
defer cancel()
resChan := make(chan Response)
errChan := make(chan error)
go process(r, resChan, errChan)
select {
case ctx.Done():
return nil, ctx.Err()
case res := <-resChan:
return &res, nil
case err := <-errChan:
return nil, err
}
}
func process(r *Req, resChan chan<- Response, errChan chan<- error) {
defer close(errChan)
defer close(resChan)
err := doSomeWork()
if err != nil {
errChan <- err
return
}
err = doSomeMoreWork()
if err != nil {
errChan <- err
return
}
res := Response{}
resChan <- res
}
Hypothetically, if the client cancelled the context or the timeout occurred before the process func had a chance to send on one of the unbuffered channels (resChan, errChan), there would be no channel readers left from Handle and sending on the channels would block indefinitely with no readers. Since process would not return in this case, the channels would also not be closed.
I came up with the process2 as a solution, but I can't help thinking I'm doing something wrong, or there's a better way to handle this.
func process2(ctx context.Context, r *Req, resChan chan<- Response, errChan chan<- error) {
defer close(errChan)
defer close(resChan)
err := doSomeWork()
select {
case <-ctx.Done():
return
default:
if err != nil {
errChan <- err
return
}
}
err = doSomeMoreWork()
select {
case <-ctx.Done():
return
default:
if err != nil {
errChan <- err
return
}
}
res := Response{}
select{
case <-ctx.Done():
return
default:
resChan <- res
}
}
This approach makes sure that each time a channel send is attempted, first the context is checked for having been completed or cancelled. If it was, then it does not attempt the send and returns. I'm pretty sure this fixes any go routine leaking happening in the first process func.
Is there a better way? Maybe I have this all wrong.

Resources