Golang Nats subscribe issue - go

I am currently working on a microservice architecture.
Before I introduce NATS into my project, I wanted to test some simple scenarios with it.
In one scenario I have a simple publisher, which publishes 100,000 messages in a for loop over a basic NATS server running on localhost:4222.
The big problem is the subscriber. After it receives between 30,000 and 40,000 messages, my whole main.go program and all other goroutines just stop and do nothing. I can only quit with Ctrl+C. Meanwhile, the publisher keeps sending messages. When I open a new terminal and start a new instance of the subscriber, everything works again until that subscriber has received about 30,000 messages. The worst part is that there is not a single error and no logs on the server, so I have no idea what is going on.
After that I replaced the Subscribe method with the QueueSubscribe method and everything works fine.
What is the main difference between Subscribe and QueueSubscribe?
Is NATS Streaming a better option? In which cases should I prefer NATS Streaming and in which the standard NATS server?
Here is my code:
Publisher:
package main

import (
    "fmt"
    "log"
    "time"

    "github.com/nats-io/go-nats"
)

func main() {
    go createPublisher()
    for {
    }
}

func createPublisher() {
    log.Println("pub started")
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Close()
    msg := make([]byte, 16)
    for i := 0; i < 100000; i++ {
        nc.Publish("alenSub", msg)
        if (i % 100) == 0 {
            fmt.Println("i", i)
        }
        time.Sleep(time.Millisecond)
    }
    log.Println("pub finish")
    nc.Flush()
}
Subscriber:
package main

import (
    "fmt"
    "log"
    "time"

    "github.com/nats-io/go-nats"
)

var received int64

func main() {
    received = 0
    go createSubscriber()
    go check()
    for {
    }
}

func createSubscriber() {
    log.Println("sub started")
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Close()
    nc.Subscribe("alenSub", func(msg *nats.Msg) {
        received++
    })
    nc.Flush()
    for {
    }
}

func check() {
    for {
        fmt.Println("-----------------------")
        fmt.Println("still running")
        fmt.Println("received", received)
        fmt.Println("-----------------------")
        time.Sleep(time.Second * 2)
    }
}

The infinite for loops are likely starving the garbage collector: https://github.com/golang/go/issues/15442#issuecomment-214965471
I was able to reproduce the issue by just running the publisher. To resolve, I recommend using a sync.WaitGroup. Here's how I updated the code linked to in the comments to get it to complete:
package main

import (
    "fmt"
    "log"
    "sync"
    "time"

    "github.com/nats-io/go-nats"
)

// create wait group
var wg sync.WaitGroup

func main() {
    // add 1 waiter
    wg.Add(1)
    go createPublisher()
    // wait for wait group to complete
    wg.Wait()
}

func createPublisher() {
    log.Println("pub started")
    // mark wait group done after createPublisher completes
    defer wg.Done()
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Close()
    msg := make([]byte, 16)
    for i := 0; i < 100000; i++ {
        if errPub := nc.Publish("alenSub", msg); errPub != nil {
            panic(errPub)
        }
        if (i % 100) == 0 {
            fmt.Println("i", i)
        }
        time.Sleep(time.Millisecond * 1)
    }
    log.Println("pub finish")
    errFlush := nc.Flush()
    if errFlush != nil {
        panic(errFlush)
    }
    errLast := nc.LastError()
    if errLast != nil {
        panic(errLast)
    }
}
I'd recommend updating the above subscriber code similarly.
The main difference between Subscribe and QueueSubscribe is that with Subscribe every subscriber receives every message, while with QueueSubscribe each message is delivered to only one subscriber in a queue group.
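As a minimal sketch of the difference (the queue name "workers" is just a placeholder, and errors are ignored for brevity):
// Plain subscription: every subscriber created like this gets every message.
nc.Subscribe("alenSub", func(msg *nats.Msg) {
    // handle msg
})

// Queue subscription: subscribers that share the queue name "workers" form a
// queue group, and each message is delivered to only one member of the group.
nc.QueueSubscribe("alenSub", "workers", func(msg *nats.Msg) {
    // handle msg
})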
Some details on additional features for NATS Streaming are here:
https://nats.io/documentation/streaming/nats-streaming-intro/
We see both NATS and NATS Streaming used in a variety of use cases from data pipelines to control planes. Your choice should be driven by the needs of your use case.

As stated, remove the for{} loop. Replace it with runtime.Goexit().
For the subscriber you don't need to create the subscription in a goroutine; async subscribers already have their own goroutine for callbacks.
Also, protect the received variable with atomic operations or a mutex.
See the examples here as well.
https://github.com/nats-io/go-nats/tree/master/examples
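Putting those suggestions together, a rough sketch of the subscriber could look like this (untested; note that runtime.Goexit() runs deferred calls, so the connection is intentionally not closed with defer here):
package main

import (
    "fmt"
    "log"
    "runtime"
    "sync/atomic"
    "time"

    "github.com/nats-io/go-nats"
)

var received int64

func main() {
    go check()

    log.Println("sub started")
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }

    // No extra goroutine is needed: async subscriptions invoke the callback
    // from the client's own goroutine.
    if _, err := nc.Subscribe("alenSub", func(msg *nats.Msg) {
        atomic.AddInt64(&received, 1) // protect the shared counter
    }); err != nil {
        log.Fatal(err)
    }
    if err := nc.Flush(); err != nil {
        log.Fatal(err)
    }

    // Ends the main goroutine without returning from main, so the check()
    // goroutine and the NATS client keep running. The connection stays open
    // for the life of the process.
    runtime.Goexit()
}

func check() {
    for {
        fmt.Println("received", atomic.LoadInt64(&received))
        time.Sleep(2 * time.Second)
    }
}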

Related

How can I ensure that a spawned go routine finishes processing an array on program termination

I am processing records from a Kafka topic. The endpoint I need to send these records to supports sending an array of up to 100 records. The Kafka records also contain the information needed for performing the REST call (currently only 1 to 2 variations, but this will increase as the number of different record types grows).
I currently load a struct array of the unique configs as they are found, and each of these configs has its own queue array. For each config, I spawn a new goroutine that processes any records in its queue on a timer (for example every 100ms). This process currently works just fine.
The issue I am having is when the program shuts down. I do not want to leave any unsent records in the queues and want to finish processing them before the app shuts down. The current code below handles the interrupt and starts checking the queue depths, but once the interrupt happens the queue count never decreases, so the program never terminates. Any thoughts would be appreciated.
package main

import (
    "context"
    "encoding/json"
    "os"
    "os/signal"
    "strconv"
    "syscall"
    "time"
    _ "time/tzdata"

    "go.uber.org/zap"
    "go.uber.org/zap/zapcore"
)

type ChannelDetails struct {
    ChannelDetails MsgChannel
    LastUsed       time.Time
    Active         bool
    Queue          []OutputMessage
}

type OutputMessage struct {
    Config  MsgConfig `json:"config"`
    Message string    `json:"message"`
}

type MsgConfig struct {
    Channel MsgChannel `json:"channel"`
}

type MsgChannel struct {
    Id      int    `json:"id"`
    MntDate string `json:"mntDate"`
    Otype   string `json:"oType"`
}

var channels []ChannelDetails

func checkQueueDepths() int {
    var depth int = 0
    for _, c := range channels {
        depth += len(c.Queue)
    }
    return depth
}

func TimeIn(t time.Time, name string) (time.Time, error) {
    loc, err := time.LoadLocation(name)
    if err == nil {
        t = t.In(loc)
    }
    return t, err
}

func find(channel *MsgChannel) int {
    for i, c := range channels {
        if c.ChannelDetails.Id == channel.Id &&
            c.ChannelDetails.MntDate == channel.MntDate {
            return i
        }
    }
    return len(channels)
}

func splice(queue []OutputMessage, count int) (ret []OutputMessage, deleted []OutputMessage) {
    ret = make([]OutputMessage, len(queue)-count)
    deleted = make([]OutputMessage, count)
    copy(deleted, queue[0:count])
    copy(ret, queue[:0])
    copy(ret[0:], queue[0+count:])
    return
}

func load(msg OutputMessage, logger *zap.Logger) {
    i := find(&msg.Config.Channel)
    if i == len(channels) {
        channels = append(channels, ChannelDetails{
            ChannelDetails: msg.Config.Channel,
            LastUsed:       time.Now(),
            Active:         false,
            Queue:          make([]OutputMessage, 0, 200),
        })
    }
    channels[i].LastUsed = time.Now()
    channels[i].Queue = append(channels[i].Queue, msg)
    if !channels[i].Active {
        channels[i].Active = true
        go process(&channels[i], logger)
    }
}

func process(data *ChannelDetails, logger *zap.Logger) {
    for {
        // if Queue is empty and not used for 5 minutes, flag as inActive and shut down go routine
        if len(data.Queue) == 0 &&
            time.Now().After(data.LastUsed.Add(time.Second*10)) { // reduced for example
            data.Active = false
            logger.Info("deactivating routine as queue is empty")
            break
        }
        // if Queue has records, process
        if len(data.Queue) != 0 {
            drainStart, _ := TimeIn(time.Now(), "America/New_York")
            spliceCnt := len(data.Queue)
            if spliceCnt > 100 {
                spliceCnt = 100 // rest api endpoint can only accept array up to 100 items
            }
            items := []OutputMessage{}
            data.Queue, items = splice(data.Queue, spliceCnt)
            // process items ... will send array of items to a rest endpoint in another go routine
            drainEnd, _ := TimeIn(time.Now(), "America/New_York")
            logger.Info("processing records",
                zap.Int("numitems", len(items)),
                zap.String("start", drainStart.Format("2006-01-02T15:04:05.000-07:00")),
                zap.String("end", drainEnd.Format("2006-01-02T15:04:05.000-07:00")),
            )
        }
        time.Sleep(time.Millisecond * time.Duration(500))
    }
}

func initZapLog() *zap.Logger {
    config := zap.NewProductionConfig()
    config.EncoderConfig.TimeKey = "timestamp"
    config.EncoderConfig.EncodeTime = zapcore.ISO8601TimeEncoder
    logger, _ := config.Build()
    zap.ReplaceGlobals(logger)
    return logger
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    logger := initZapLog()
    defer logger.Sync()

    test1 := `{
        "config": {
            "channel": {
                "id": 1,
                "mntDate": "2021-12-01",
                "oType": "test1"
            }
        },
        "message": "test message1"
    }`
    test2 := `{
        "config": {
            "channel": {
                "id": 2,
                "mntDate": "2021-12-01",
                "oType": "test2"
            }
        },
        "message": "test message2"
    }`

    var testMsg1 OutputMessage
    err := json.Unmarshal([]byte(test1), &testMsg1)
    if err != nil {
        logger.Panic("unable to unmarshall test1 data " + err.Error())
    }
    var testMsg2 OutputMessage
    err = json.Unmarshal([]byte(test2), &testMsg2)
    if err != nil {
        logger.Panic("unable to unmarshall test2 data " + err.Error())
    }

    exitCh := make(chan struct{})
    go func(ctx context.Context) {
        for {
            // original data is streamed from kafka
            load(testMsg1, logger)
            load(testMsg2, logger)
            time.Sleep(time.Millisecond * time.Duration(5))
            select {
            case <-ctx.Done():
                logger.Info("received done")
                var depthChk int
                for {
                    depthChk = checkQueueDepths()
                    if depthChk == 0 {
                        break
                    } else {
                        logger.Info("Still processing queues. Msgs left: " + strconv.Itoa(depthChk))
                    }
                    time.Sleep(100 * time.Millisecond)
                }
                exitCh <- struct{}{}
                return
            default:
            }
        }
    }(ctx)

    sigs := make(chan os.Signal, 1)
    signal.Notify(sigs, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
    go func() {
        <-sigs
        depths := checkQueueDepths()
        logger.Info("You pressed ctrl + C. Queue depth is: " + strconv.Itoa(depths))
        cancel()
    }()

    <-exitCh
}
example logs:
{"level":"info","timestamp":"2021-12-28T15:26:06.136-0500","caller":"testgo/main.go:116","msg":"processing records","numitems":91,"start":"2021-12-28T15:26:06.136-05:00","end":"2021-12-28T15:26:06.136-05:00"}
{"level":"info","timestamp":"2021-12-28T15:26:06.636-0500","caller":"testgo/main.go:116","msg":"processing records","numitems":92,"start":"2021-12-28T15:26:06.636-05:00","end":"2021-12-28T15:26:06.636-05:00"}
^C{"level":"info","timestamp":"2021-12-28T15:26:06.780-0500","caller":"testgo/main.go:205","msg":"You pressed ctrl + C. Queue depth is: 2442"}
{"level":"info","timestamp":"2021-12-28T15:26:06.783-0500","caller":"testgo/main.go:182","msg":"received done"}
{"level":"info","timestamp":"2021-12-28T15:26:06.783-0500","caller":"testgo/main.go:189","msg":"Still processing queues. Msgs left: 2442"} --line repeats forever
The sync package (https://pkg.go.dev/sync) has the WaitGroup type, which allows you to wait for a group of goroutines to complete before the main routine returns.
The best usage example is in this blog post:
https://go.dev/blog/pipelines
To 'wait' for all spawned goroutines to finish from inside the main goroutine, there are two ways to do this. The simplest would be to add a
runtime.Goexit()
to the end of your main goroutine, after <-exitCh
Simply, it does this:
"Calling Goexit from the main goroutine terminates that goroutine without func main returning. Since func main has not returned, the program continues execution of other goroutines. If all other goroutines exit, the program crashes."
The other way would be to use a WaitGroup. Think of a WaitGroup as a counter, with a method that makes the program 'wait' on the line where it is called until the counter hits zero:
var wg sync.WaitGroup // declare the waitgroup
Then, for each goroutine you want to wait on, you increment the waitgroup (typically just before spawning it):
wg.Add(1) // call this once for each spawned goroutine
When the goroutine has finished its work, you call
wg.Done() // call this when the spawned routine is done
which decrements the counter.
Then, where you want the code to 'wait' until the counter is zero, you add the line:
wg.Wait() // wait here till counter hits zero
The code will block there until the number of goroutines counted with Add() and decremented with Done() reaches zero.
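Put together, a minimal self-contained sketch of the pattern (not tied to the Kafka code above):
package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    for i := 0; i < 3; i++ {
        wg.Add(1) // one Add per spawned goroutine
        go func(id int) {
            defer wg.Done() // decrement the counter when this worker finishes
            time.Sleep(time.Duration(id*100) * time.Millisecond)
            fmt.Println("worker", id, "done")
        }(i)
    }
    wg.Wait() // blocks here until the counter is back to zero
    fmt.Println("all workers finished")
}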

Print message if Semaphore blocks for too long, but don't unblock caller when message is printed

I have the following code in Go using the semaphore library just as an example:
package main

import (
    "context"
    "fmt"
    "time"

    "golang.org/x/sync/semaphore"
)

// This protects the lockedVar variable
var lock *semaphore.Weighted

// Only one go routine should be able to access this at once
var lockedVar string

func acquireLock() {
    err := lock.Acquire(context.TODO(), 1)
    if err != nil {
        panic(err)
    }
}

func releaseLock() {
    lock.Release(1)
}

func useLockedVar() {
    acquireLock()
    fmt.Printf("lockedVar used: %s\n", lockedVar)
    releaseLock()
}

func causeDeadLock() {
    acquireLock()
    // calling this from a function that's already
    // locked the lockedVar should cause a deadlock.
    useLockedVar()
    releaseLock()
}

func main() {
    lock = semaphore.NewWeighted(1)
    lockedVar = "this is the locked var"
    // this is only on a separate goroutine so that the standard
    // go "deadlock" message doesn't print out.
    go causeDeadLock()
    // Keep the primary goroutine active.
    for true {
        time.Sleep(time.Second)
    }
}
Is there a way to get the acquireLock() function call to print a message after a timeout, indicating that there is a potential deadlock, but without unblocking the call? I would want the deadlock to persist, but a log message to be written in the event that the timeout is reached. So a TryAcquire isn't exactly what I want.
An example of what I want in pseudo code:
afterFiveSeconds := func() {
    fmt.Printf("there is a potential deadlock\n")
}
lock.Acquire(context.TODO(), 1, afterFiveSeconds)
The lock.Acquire call in this example would call the afterFiveSeconds callback if the Acquire call blocked for more than 5 seconds, but it would not unblock the caller. It would continue to block.
I think I've found a solution to my problem.
func acquireLock() {
    timeoutChan := make(chan bool)
    go func() {
        select {
        case <-time.After(time.Second * time.Duration(5)):
            fmt.Printf("potential deadlock while acquiring semaphore\n")
        case <-timeoutChan:
            break
        }
    }()
    err := lock.Acquire(context.TODO(), 1)
    close(timeoutChan)
    if err != nil {
        panic(err)
    }
}
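The same idea can be pulled into a small helper so any acquire site can reuse it. A sketch, reusing the imports already present above (the name acquireWithWarning is made up for illustration):
// acquireWithWarning blocks exactly like sem.Acquire, but prints a warning if
// acquisition has not completed within warnAfter. The caller stays blocked
// either way.
func acquireWithWarning(sem *semaphore.Weighted, warnAfter time.Duration) error {
    acquired := make(chan struct{})
    go func() {
        select {
        case <-time.After(warnAfter):
            fmt.Printf("potential deadlock: semaphore not acquired after %v\n", warnAfter)
        case <-acquired:
        }
    }()
    err := sem.Acquire(context.TODO(), 1)
    close(acquired)
    return err
}
acquireLock() could then simply call acquireWithWarning(lock, 5*time.Second) and panic on a non-nil error.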

Keep a MQTT Go client running

I think it is a silly question: I need an MQTT client to keep running after connecting and subscribing. I never encountered the problem before because my MQTT clients are always coupled with an HTTP server, and when launching an HTTP server the code doesn't stop running.
But in the present use case I only need an MQTT client to subscribe to some topic and stay alive.
Here is what I do (the function just connects to a broker and subscribes to one topic):
func main() {
    godotenv.Load("./.env")
    _initMqttConnection()
}
I need the client to stay connected and not stop just after the subscription is done.
How do I perform that simple thing?
Edit 1 : Complete Code
package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "os"
    "path/filepath"
    "strings"

    "github.com/yosssi/gmq/mqtt"
    "github.com/yosssi/gmq/mqtt/client"

    "github.com/joho/godotenv"
    "github.com/skratchdot/open-golang/open"
)

var cli *client.Client

func _initMqttConnection() {
    cli = client.New(&client.Options{
        ErrorHandler: func(err error) {
            fmt.Println(err)
        },
    })
    defer cli.Terminate()

    log.Println("Connecting to " + os.Getenv("mqtt_host"))
    err := cli.Connect(&client.ConnectOptions{
        Network:  "tcp",
        Address:  os.Getenv("mqtt_host"),
        UserName: []byte(os.Getenv("mqtt_user")),
        Password: []byte(os.Getenv("mqtt_password")),
        ClientID: []byte("mqtt_video_launcher"),
    })
    if err != nil {
        log.Println("Error 1")
        panic(err)
    }
    log.Println("Connected to MQTT")

    topic_to_sub := []byte("/" + os.Getenv("video_topic"))
    err = cli.Subscribe(&client.SubscribeOptions{
        SubReqs: []*client.SubReq{
            &client.SubReq{
                TopicFilter: topic_to_sub,
                QoS:         mqtt.QoS0,
                Handler: func(topicName, message []byte) {
                    // do stuff with message
                    fmt.Println(string(topicName), string(message))
                },
            },
        },
    })
    if err != nil {
        panic(err)
    }
    log.Println("Subscription OK : " + string(topic_to_sub[:len(topic_to_sub)]))
}

func main() {
    godotenv.Load("./.env")
    _initMqttConnection()
}
The temporary solution I use is adding:
http.ListenAndServe(":", nil)
at the end.
You have to make the program run forever, or until you explicitly want to end it (Ctrl+C). One good solution that worked for me is to wait on a channel before exiting the main function, and that channel can keep listening for an interrupt.
Eg:
func main() {
    keepAlive := make(chan os.Signal, 1) // buffered, as recommended for signal.Notify
    signal.Notify(keepAlive, os.Interrupt, syscall.SIGTERM)
    // All your code
    <-keepAlive
}
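Applied to the MQTT code above, main could look roughly like this (a sketch; it needs the os, os/signal and syscall imports, and note that the defer cli.Terminate() inside _initMqttConnection still fires as soon as that function returns, so you would probably want to move the Terminate call out of it):
func main() {
    godotenv.Load("./.env")
    _initMqttConnection()

    // Block until an interrupt or termination signal arrives.
    keepAlive := make(chan os.Signal, 1)
    signal.Notify(keepAlive, os.Interrupt, syscall.SIGTERM)
    <-keepAlive
}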

Quit channel on a worker pool implementation

What I eventually want to accomplish is to dynamically scale my workers up or down, depending on the workload.
The code below successfully parses data when a Task comes through w.Channel:
func (s *Storage) StartWorker(w *app.Worker) {
    go func() {
        for {
            w.Pool <- w.Channel // register current worker to the worker pool
            select {
            case task := <-w.Channel: // received a work request, do some work
                time.Sleep(task.Delay)
                fmt.Println(w.WorkerID, "processing task:", task.TaskName)
                w.Results <- s.ProcessTask(w, &task)
            case <-w.Quit:
                fmt.Println("Closing channel for", w.WorkerID)
                return
            }
        }
    }()
}
The blocking point here is the line below:
w.Pool <- w.Channel
Because of that, if I try to stop a worker anywhere in my program with:
w.Quit <- true
the case <-w.Quit: is blocked and never receives until another Task comes in on w.Channel (and I guess the select statement chooses randomly between ready cases).
So how can I stop a channel (worker) independently?
See the sample code below; it declares a fanout function that is responsible for sizing the workers up and down.
It works by using timeouts to detect that workers have ended or that new ones need to be spawned.
There is an inner loop to ensure that each item is processed before moving on, blocking the source when needed.
package main

import (
    "time"
)

func main() {
    // in real code, input would be fed by the task producer
    input := make(chan string)
    fanout(input)
}

func fanout(input chan string) {
    workers := 0
    distribute := make(chan string)
    workerEnd := make(chan bool)
    for i := range input {
        done := false
        for !done {
            select {
            case distribute <- i:
                done = true
            case <-workerEnd:
                workers--
            default:
                if workers < 10 {
                    workers++
                    go func() {
                        work(distribute)
                        workerEnd <- true
                    }()
                }
            }
        }
    }
}

func work(input chan string) {
    for {
        select {
        case item := <-input:
            _ = item                       // process the item here
            <-time.After(time.Millisecond) // simulate work
        case <-time.After(time.Second):
            return // no work for a second: let this worker end
        }
    }
}
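As a side note on the original blocking point: the registration send itself could also be wrapped in a select, so that Quit can interrupt a worker that is parked waiting to register (a sketch against the StartWorker code above):
select {
case w.Pool <- w.Channel: // registered; fall through and wait for a task
case <-w.Quit:
    fmt.Println("Closing channel for", w.WorkerID)
    return
}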

Golang listenUDP multiple ports blocking with BigTable connection

I'm creating a simple UDP client that listens on multiple ports and saves the requests to Bigtable.
It's essential to listen on different ports, before you ask.
Everything was working nicely until I included Bigtable. After doing so, the listeners block completely.
My stripped down code, without bigtable, looks like this:
func flow(port string) {
    protocol := "udp"
    udpAddr, err := net.ResolveUDPAddr(protocol, "0.0.0.0:"+port)
    if err != nil {
        fmt.Println("Wrong Address")
        return
    }
    udpConn, err := net.ListenUDP(protocol, udpAddr)
    if err != nil {
        fmt.Println(err)
    }
    defer udpConn.Close()
    for {
        Publish(udpConn, port)
    }
}

func main() {
    fmt.Print("Starting server.........")
    for i := *Start; i <= *End; i++ {
        x := strconv.Itoa(i)
        go flow(x)
    }
}
This works fine. However, as soon as I add the following for Bigtable, the whole thing blocks. If I remove the goroutine that creates the listener (which means I can't listen on multiple ports), it works.
func createBigTable() {
    ctx := context.Background()
    client, err := bigtable.NewClient(ctx, *ProjectID, *Instance)
    if err != nil {
        log.Fatal("Bigtable NewClient:", err)
    }
    Table = client.Open("x")
}
I managed to get it working by adding a query in the createBigTable func, but the program still blocks later on.
I have no idea if this is an issue with Bigtable, gRPC, or just the way I'm doing it.
Would really appreciate some advice on how to fix this.
--- UPDATE ---
I've discovered the issue isn't just with BigTable - I also have the same issue when I call gcloud pubsub.
--- UPDATE 2 ---
createBigTable is called in the init function (BEFORE THE MAIN FUNCTION):
func init() {
    createBigTable()
}
--- Update 3 ---
Output from sigquit can be found here:
https://pastebin.com/fzixqmiA
In your playground example, you're using for {} to keep the server running forever.
This seems to deprive the goroutines of ever getting to run.
Try using e.g. a WaitGroup to yield control from the main() routine and let the flow() routines handle the incoming UDP packets.
import (
    ...
    "sync"
    ...
)
...
func main() {
    fmt.Print("Starting server.")
    for i := *Start; i <= *End; i++ {
        x := strconv.Itoa(i)
        go flow(x)
    }
    var wg sync.WaitGroup
    wg.Add(1)
    wg.Wait()
}
