I'm integrating Binance API into an existing system and while most parts a straight forward, the data streaming API hits my limited understanding of go-routines. I don't believe there is anything special in the golang SDK for Binance, but essentially I only need two functions, one that starts the data stream and processes events with the event handler given as a parameter and a second one that ends the data stream without actually shutting down the client as it would close all other connections. On a previous project, there were two message types for this, but the binance SDK uses an implementation that returns two go channels, one for errors and an another one, I guess from the name, for stopping the data stram.
The code I wrote for starting the data stream looks like this:
func startDataStream(symbol, interval string, wsKlineHandler futures.WsKlineHandler, errHandler futures.ErrHandler) (err error){
doneC, stopC, err := futures.WsKlineServe(symbol, interval, wsKlineHandler, errHandler)
if err != nil {
fmt.Println(err)
return err
}
return nil
}
This works as expected and streams data. A simple test verifies it:
func runWSDataTest() {
symbol := "BTCUSDT"
interval := "15m"
errHandler := func(err error) {fmt.Println(err)}
wsKlineHandler := func(event *futures.WsKlineEvent) {fmt.Println(event)}
_ = startDataStream(symbol, interval, wsKlineHandler, errHandler)
}
The thing that is not so clear to me, mainly due to incomplete understanding, really is how do I stop the stream. I think the returned stopC channel can be used to somehow issue a end singnal similar to, say, a sigterm on system level and then the stream should end.
Say, I have a stopDataStream function that takes a symbol as an argument
func stopDataStream(symbol){
}
Let's suppose I start 5 data streams for five symbols and now I want to stop just one of the streams. That begs the question of:
How do I track all those stopC channels?
Can I use a collection keyed with the symbol, pull the stopC channel, and then just issue a signal to end just that data stream?
How do I actually write into the stopC channel from the stop function?
Again, I don't think this is particularly hard, it's just I could not figure it out yet from the docs so any help would be appreciated.
Thank you
(Answer originally written by #Marvin.Hansen)
Turned out, just saving & closing the channel solved it all. I was really surprised how easy this is, but here is the code of the updated functions:
func startDataStream(symbol, interval string, wsKlineHandler futures.WsKlineHandler, errHandler futures.ErrHandler) (err error) {
_, stopC, err := futures.WsKlineServe(symbol, interval, wsKlineHandler, errHandler)
if err != nil {
fmt.Println(err)
return err
}
// just save the stop channel
chanMap[symbol] = stopC
return nil
}
And then, the stop function really becomes embarrassing trivial:
func stopDataStream(symbol string) {
stopC := chanMap[symbol] // load the stop channel for the symbol
close(stopC) // just close it.
}
Finally, testing it all out:
var (
chanMap map[string]chan struct{}
)
func runWSDataTest() {
chanMap = make(map[string]chan struct{})
symbol := "BTCUSDT"
interval := "15m"
errHandler := func(err error) { fmt.Println(err) }
wsKlineHandler := getKLineHandler()
println("Start stream")
_ = startDataStream(symbol, interval, wsKlineHandler, errHandler)
time.Sleep(3 * time.Second)
println("Stop stream")
stopDataStream(symbol)
time.Sleep(1 * time.Second)
}
This is it.
Related
I am connecting to a websocket that is stream live stock trades.
I have to read the prices, perform calculations on the fly and based on these calculations make another API call e.g. buy or sell.
I want to ensure my calculations/processing doesn't slow down my ability to stream in all the live data.
What is a good design pattern to follow for this type of problem?
Is there a way to log/warn in my system to know if I am falling behind?
Falling behind means: the websocket is sending price data, and I am not able to process that data as it comes in and it is lagging behind.
While doing the c.ReadJSON and then passing the message to my channel, there might be a delay in deserializing into JSON
When inside my channel and processing, calculating formulas and sending another API request to buy/sell, this will add delays
How can I prevent lags/delays and also monitor if indeed there is a delay?
func main() {
c, _, err := websocket.DefaultDialer.Dial("wss://socket.example.com/stocks", nil)
if err != nil {
panic(err)
}
defer c.Close()
// Buffered channel to account for bursts or spikes in data:
chanMessages := make(chan interface{}, 10000)
// Read messages off the buffered queue:
go func() {
for msgBytes := range chanMessages {
logrus.Info("Message Bytes: ", msgBytes)
}
}()
// As little logic as possible in the reader loop:
for {
var msg interface{}
err := c.ReadJSON(&msg)
if err != nil {
panic(err)
}
chanMessages <- msg
}
}
You can read bytes, pass them to the channel, and use other goroutines to do conversion.
I worked on a similar crypto market bot. Instead of creating large buffured channel i created buffered channel with cap of 1 and used select statement for sending socket data to channel.
Here is the example
var wg sync.WaitGroup
msg := make(chan []byte, 1)
wg.Add(1)
go func() {
defer wg.Done()
for data := range msg {
// decode and process data
}
}()
for {
_, data, err := c.ReadMessage()
if err != nil {
log.Println("read error: ", err)
return
}
select {
case msg <- data: // in case channel is free
default: // if not, next time will try again with latest data
}
}
This will insure that you'll get the latest data when you are ready to process.
This one is a tricky issue that bugs me quite a bit.
Essentially, I wrote an integration microservice that provides data streams from Binance crypto exchange using the Go client. A client sends a start messages, starts data stream for a symbol, and at some point, sends a close message to stop the stream. My implementation looks basically like this:
func (c BinanceClient) StartDataStream(clientType bn.ClientType, symbol, interval string) error {
switch clientType {
case bn.SPOT_LIVE:
wsKlineHandler := c.handlers.klineHandler.SpotKlineHandler
wsErrHandler := c.handlers.klineHandler.ErrHandler
_, stopC, err := binance.WsKlineServe(symbol, interval, wsKlineHandler, wsErrHandler)
if err != nil {
fmt.Println(err)
return err
} else {
c.state.clientSymChanMap[clientType][symbol] = stopC
return nil
}
...
}
The clientSymChanMap stores the stopChannel in a nested hashmap so that I can retrieve the stop channel later to stop the data feed. The stop function has been implemented accordingly:
func (c BinanceClient) StopDataStream(clientType bn.ClientType, symbol string) {
//mtd := "StopDataStream: "
stopC := c.state.clientSymChanMap[clientType][symbol]
if isClosed(stopC) {
DbgPrint(" Channel is already closed. Do nothing for: " + symbol)
} else {
close(stopC)
}
// Delete channel from the map otherwise the next StopAll throws a NPE due to closing a dead channel
delete(c.state.clientSymChanMap[clientType], symbol)
return
}
To prevent panics from already closed channels, I use a check function that returns true in case the channel is already close.
func isClosed(ch <-chan struct{}) bool {
select {
case <-ch:
return true
default:
}
return false
}
Looks nice, but has a catch. When I run the code with starting data for just one symbol, it starts and closes the datafeed exactly as expected.
However, when starting multiple data feeds, then the above code somehow never closes the websocket and just keeps streaming data forever. Without the isClosed check, I get panics of trying to close a closed channel, but with the check in place, well, nothing gets closed.
When looking at the implementation of the above binance.WsKlineServe function, it's quite obvious that it just wraps a new websocket with each invocation and then returns the done & stop channel.
The documentation gives the following usage example:
wsKlineHandler := func(event *binance.WsKlineEvent) {
fmt.Println(event)
}
errHandler := func(err error) {
fmt.Println(err)
}
doneC, stopC, err := binance.WsKlineServe("LTCBTC", "1m", wsKlineHandler, errHandler)
if err != nil {
fmt.Println(err)
return
}
<-doneC
Because the doneC channel actually blocks, I removed it and thought that storing the stopC channel and then use it later to stop the datafeed would work. However, it only does so for one single instance. When multiple streams are open, this doesn't work anymore.
Any idea what that's the case and how to fix it?
Firstly, this is dangerous:
if isClosed(stopC) {
DbgPrint(" Channel is already closed. Do nothing for: " + symbol)
} else {
close(stopC) // <- can't be sure channel is still open
}
there is no guarantee that after your polling check of the channel state, that the channel will still be in that same state in the next line of code. So this code could in theory could panic if it's called concurrently.
If you want an asynchronous action to occur on the channel close - it's best to do this explicitly from its own goroutine. So you could try this:
go func() {
stopC := c.state.clientSymChanMap[clientType][symbol]
<-stopC
// stopC definitely closed now
delete(c.state.clientSymChanMap[clientType], symbol)
}()
P.S. you do need some sort of mutex on your map, since the delete is asynchronous - you need to ensure any adds to the map don't datarace with this.
P.P.S Channels are reclaimed by the GC when they go out of scope. If you are no longer reading from it - they do not need to be explicitly closed to be reclaimed by the GC.
Using channels for stopping a goroutine or closing something is very tricky. There are lots of things you can do wrong or forget to do.
context.WithCancel abstracts that complexity away, making the code more readable and maintainable.
Some code snippets:
ctx, cancel := context.WitchCancel(context.TODO())
TheThingToCancel(ctx, ...)
// Whenever you want to stop TheThingToCancel. Can be called multiple times.
cancel()
Then in a for loop you'd often have a select like this:
for {
select {
case <-ctx.Done():
return
default:
}
// do stuff
}
Here some code that is closer to your specific case of an open connection:
func TheThingToCancel(ctx context.Context) (context.CancelFunc, error) {
ctx, cancel := context.WithCancel(ctx)
conn, err := net.Dial("tcp", ":12345")
if err != nil {
cancel()
return nil, err
}
go func() {
<-ctx.Done()
_ = conn.Close()
}()
go func() {
defer func() {
_ = conn.Close()
// make sure context is always cancelled to avoid goroutine leak
cancel()
}()
var bts = make([]byte, 1024)
for {
n, err := conn.Read(bts)
if err != nil {
return
}
fmt.Println(bts[:n])
}
}()
return cancel, nil
}
It returns the cancel function to be able to close it from the outside.
Cancelling a context can be done many times over without a panic like would occur if a channel is closed multiple times. That is one advantage. Also you can derive contexts from other contexts and thereby close a lot of contexts that all stop different routines by closing a parent context. Carefully designed, this is very powerful for shutting down different routines belonging together that also need to be able to be shut down individually.
I'm new to Go and concurrency in Go. I'm trying to use a Go context to cancel a set of Go routines once I find a member with a given ID.
A Group stores a list of Clients, and each Client has a list of Members. I want to search in parallel all the Clients and all their Members to find a Member with a given ID. Once this Member is found, I want to cancel all the other Go routines and return the discovered Member.
I've tried the following implementation, using a context.WithCancel and a WaitGroup.
This doesn't work however, and hangs indefinitely, never getting past the line waitGroup.Wait(), but I'm not sure why exactly.
func (group *Group) MemberWithID(ID string) (*models.Member, error) {
found := make(chan *models.Member)
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()
var waitGroup sync.WaitGroup
for _, client := range group.Clients {
waitGroup.Add(1)
go func(clientToQuery Client) {
defer waitGroup.Done()
select {
case <-ctx.Done():
return
default:
}
member, _ := client.ClientMemberWithID(ID)
if member != nil {
found <- member
cancel()
return
}
} (client)
}
waitGroup.Wait()
if len(found) > 0 {
return <-found, nil
}
return nil, fmt.Errorf("no member found with given id")
}
found is an unbuffered channel, so sending on it blocks until there is someone ready to receive from it.
Your main() function would be the one to receive from it, but only after waitGroup.Wait() returns. But that will block until all launched goroutines call waitGroup.Done(). But that won't happen until they return, which won't happen until they can send on found. It's a deadlock.
If you change found to be buffered, that will allow sending values on it even if main() is not ready to receive from it (as many values as big the buffer is).
But you should receive from found before waitGroup.Wait() returns.
Another solution is to use a buffer of 1 for found, and use non-blocking send on found. That way the first (fastest) goroutine will be able to send the result, and the rest (given we're using non-blocking send) will simply skip sending.
Also note that it should be the main() that calls cancel(), not each launched goroutines individually.
For this kind of use case I think a sync.Once is probably a better fit than a channel. When you find the first non-nil member, you want to do two different things:
Record the member you found.
Cancel the remaining goroutines.
A buffered channel can easily do (1), but makes (2) a bit more complicated. But a sync.Once is perfect for doing two different things the first time something interesting happens!
I would also suggest aggregating non-trivial errors, so that you can report something more useful than no member found if, say, your database connection fails or some other nontrivial error occurs. You can use a sync.Once for that, too!
Putting it all together, I would want to see something like this (https://play.golang.org/p/QZXUUnbxOv5):
func (group *Group) MemberWithID(ctx context.Context, id string) (*Member, error) {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
var (
wg sync.WaitGroup
member *Member
foundOnce sync.Once
firstNontrivialErr error
errOnce sync.Once
)
for _, client := range group.Clients {
wg.Add(1)
client := client // https://golang.org/doc/faq#closures_and_goroutines
go func() {
defer wg.Done()
m, err := client.ClientMemberWithID(ctx, id)
if m != nil {
foundOnce.Do(func() {
member = m
cancel()
})
} else if nf := (*MemberNotFoundError)(nil); !errors.As(err, &nf) {
errOnce.Do(func() {
firstNontrivialErr = err
})
}
}()
}
wg.Wait()
if member == nil {
if firstNontrivialErr != nil {
return nil, firstNontrivialErr
}
return nil, &MemberNotFoundError{ID: id}
}
return member, nil
}
I'm using channels in Go to process a data pipeline of sorts. The code looks something like this:
type Channels struct {
inputs chan string
errc chan error
quit chan struct{}
}
func (c *Channels) doSomethingWithInput() {
defer close(c.quit)
defer close(c.errc)
for input := range p.inputs {
_, err := doSomethingThatSometimesErrors(input)
if err != nil {
c.errc <- err
return
}
}
doOneFinalThingThatCannotError()
return
}
func (c *Channels) inputData(s string) {
// This function implementation is my question
}
func StartProcessing(c *Channels, data ...string) error {
go c.doSomethingWithInput()
go func() {
defer close(c.inputs)
for _, i := range data {
select {
case <-c.quit:
break
default:
}
inputData(i)
}
}()
// Block until the quit channel is closed.
<-c.quit
if err := <-c.errc; err != nil {
return err
}
return nil
}
This seems like a reasonable way to communicate a quit signal between channel processors and is based on this blog post about concurrency patterns in Go.
The thing I struggle with using this pattern is the inputData function. Adding strings to the input channel needs to wait for doSomethingWithInput() to read the channel, but it also might error. inputData needs to try and feed the inputs channel but give up if told to quit. The best I could do was this:
func (c *Channels) inputData(s string) {
for {
select {
case <-c.quit:
return
case c.inputs <- s:
return
}
}
}
Essentially, "oscillate between your options until one of them sticks." To be clear, I don't think it's a bad design. It just feels... wasteful. Like I'm missing something clever. How can I tell a channel sender to quit in Go when a channel consumer errors?
Your inputData() is fine, that's the way to do it.
In your use case, your channel consumer, the receiver, aka doSomethingWithInput() is the one which should have control over the "quit" channel. As it is, if an error occurs, just return from doSomethingWithInput(), which will in turn close the quit channel and make the sender(s) quit (will trigger case <-quit:). That is in fact the clever bit.
Just watch out with your error channel that's not buffered and closed when doSomethingWithInput() exits. You cannot read it afterwards to collect errors. You need to close it in your main function and initialize it with some capacity (make(chan int, 10) for example), or create a consumer goroutine for it. You may also want to try reading it with a select statement: your error checking code, as it is, will block forever if there are no errors.
The problem is this: There is a web server. I figured that it would be beneficial to use goroutines in page loading, so I went ahead and did: called loadPage function as a goroutine. However, when doing this, the server simply stops working without errors. It prints a blank, white page. The problem has to be in the function itself- something there is conflicting with the goroutine somehow.
These are the relevant functions:
func loadPage(w http.ResponseWriter, path string) {
s := GetFileContent(path)
w.Header().Add("Content-Type", getHeader(path))
w.Header().Add("Content-Length", GetContentLength(path))
fmt.Fprint(w, s)
}
func GetFileContent(path string) string {
cont, err := ioutil.ReadFile(path)
e(err)
aob := len(cont)
s := string(cont[:aob])
return s
}
func GetFileContent(path string) string {
cont, err := ioutil.ReadFile(path)
e(err)
aob := len(cont)
s := string(cont[:aob])
return s
}
func getHeader(path string) string {
images := []string{".jpg", ".jpeg", ".gif", ".png"}
readable := []string{".htm", ".html", ".php", ".asp", ".js", ".css"}
if ArrayContainsSuffix(images, path) {
return "image/jpeg"
}
if ArrayContainsSuffix(readable, path) {
return "text/html"
}
return "file/downloadable"
}
func ArrayContainsSuffix(arr []string, c string) bool {
length := len(arr)
for i := 0; i < length; i++ {
s := arr[i]
if strings.HasSuffix(c, s) {
return true
}
}
return false
}
The reason why this happens is because your HandlerFunc which calls "loadPage" is called synchronously with the request. When you call it in a go routine the Handler is actually returning immediately, causing the response to be sent immediately. That's why you get a blank page.
You can see this in server.go (line 1096):
serverHandler{c.server}.ServeHTTP(w, w.req)
if c.hijacked() {
return
}
w.finishRequest()
The ServeHTTP function calls your handler, and as soon as it returns it calls "finishRequest". So your Handler function must block as long as it wants to fulfill the request.
Using a go routine will actually not make your page any faster. Synchronizing a singe go routine with a channel, as Philip suggests, will also not help you in this case as that would be the same as not having the go routine at all.
The root of your problem is actually ioutil.ReadFile, which buffers the entire file into memory before sending it.
If you want to stream the file you need to use os.Open. You can use io.Copy to stream the contents of the file to the browser, which will used chunked encoding.
That would look something like this:
f, err := os.Open(path)
if err != nil {
http.Error(w, "Not Found", http.StatusNotFound)
return
}
n, err := io.Copy(w, f)
if n == 0 && err != nil {
http.Error(w, "Error", http.StatusInternalServerError)
return
}
If for some reason you need to do work in multiple go routines, take a look at sync.WaitGroup. Channels can also work.
If you are trying to just serve a file, there are other options that are optimized for this, such as FileServer or ServeFile.
In the typical web framework implementations in Go, the route handlers are invoked as Goroutines. I.e. at some point the web framework will say go loadPage(...).
So if you call a Go routine from inside loadPage, you have two levels of Goroutines.
The Go scheduler is really lazy and will not execute the second level if it's not forced to. So you need to enforce it through synchronization events. E.g. by using channels or the sync package. Example:
func loadPage(w http.ResponseWriter, path string) {
s := make(chan string)
go GetFileContent(path, s)
fmt.Fprint(w, <-s)
}
The Go documentation says this:
If the effects of a goroutine must be observed by another goroutine,
use a synchronization mechanism such as a lock or channel
communication to establish a relative ordering.
Why is this actually a smart thing to do? In larger projects you may deal with a large number of Goroutines that need to be coordinated somehow efficiently. So why call a Goroutine if it's output is used nowhere? A fun fact: I/O operations like fmt.Printf do trigger synchronization events too.