golang goroutine practice, function or channel? - go

I constantly receive JSON data from a websocket and process it in a goroutine. I am not sure whether this pattern is encouraged or not.
ws.onmessage { // receives messages from the websocket indefinitely
    go func() { // works fine using this goroutine
        defer processJson(message)
    }()
    go processJson(message) // errors out and the program terminates
}
func processJson(msg string) {
    // code to process the JSON
    insertDatabase(processedMsg)
}
func insertDatabase(processedMsg string) {
    // code to insert into the database
}
The first goroutine, below, works fine most of the time, but occasionally (about once a week) the program reports a data race and terminates.
go func() {
    defer processJson(message)
}()
The second goroutine often fails after running for a few minutes, usually with "fatal error: unexpected signal during runtime execution".
go processJson(message)
From my understanding both goroutines do the same thing, so why does the first run well while the second does not? I have tried using a channel, but it makes little difference compared to the first goroutine.
msgChan := make(chan string, 1000)
go JsonProcessor(msgChan)
for { // receive JSON from the websocket and send it to the channel
    msgChan <- message
}
func JsonProcessor(msg chan string) {
    for { // get data from the channel and process it in this goroutine
        msgModified := <-msg
        insertDatabase(msgModified)
    }
}
Is there a recommended way to achieve this without data races? Suggestions are welcome.
Thanks, much appreciated.

Try using a sync.Mutex to avoid the data race:
var mutex sync.Mutex
ws.onmessage {
    processJson(message)
}
func processJson(msg string) {
    mutex.Lock()
    defer mutex.Unlock()
    // .........
}
If the processing can be split into parts that do not race with each other, a multi-goroutine version could look like this:
msgChan1 := make(chan string, 1000)
msgChan2 := make(chan string, 1000)
go func() {
    for m := range msgChan1 {
        // ...
    }
}()
go func() {
    for m := range msgChan2 {
        // ...
    }
}()
ws.onmessage {
    msgChan1 <- message
    msgChan2 <- message
}
ws.onclose {
    close(msgChan1)
    close(msgChan2)
}
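As a variant of the channel approach above, here is a small runnable sketch in which several workers read from one channel and the channel is closed when the input ends (the equivalent of `ws.onclose`). The names `startWorkers` and `runPipeline` are made up for this example, and `strings.ToUpper` is only a placeholder for the real JSON processing.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// startWorkers launches n goroutines that read from in and apply handle to each message.
// The returned WaitGroup is done once in has been closed and drained.
func startWorkers(n int, in <-chan string, handle func(string)) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range in {
				handle(m)
			}
		}()
	}
	return &wg
}

// runPipeline pushes msgs through n workers and returns the processed results.
func runPipeline(n int, msgs []string) []string {
	in := make(chan string, len(msgs))
	var mu sync.Mutex
	var out []string
	wg := startWorkers(n, in, func(m string) {
		processed := strings.ToUpper(m) // placeholder for the real processing
		mu.Lock()                       // out is shared between workers
		out = append(out, processed)
		mu.Unlock()
	})
	for _, m := range msgs {
		in <- m
	}
	close(in) // lets the workers leave their range loops
	wg.Wait()
	return out
}

func main() {
	fmt.Println(len(runPipeline(2, []string{"a", "b", "c"})))
}
```

With more than one worker the order of results is not guaranteed, so this only fits processing that does not depend on message order.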

Related

Golang Server Sent Events Per User

I've been working with Go for some time but have never done SSE before. I'm having an issue; can someone please provide a working example of server-sent events that are sent only to a specific user (connection)?
I'm using gorilla/sessions to authenticate, and I would like to use the UserID to separate connections.
Or should I use 5-second polling via Ajax?
Many thanks.
Here is what I found and tried:
https://gist.github.com/ismasan/3fb75381cd2deb6bfa9c It doesn't send to an individual user, and the goroutine won't stop if the connection is closed.
https://github.com/striversity/gotr/blob/master/010-server-sent-event-part-2/main.go This is close to what I need, but it doesn't track when a connection is removed. Once you close the browser and reopen it in a private window, it stops working entirely. Also, as above, the goroutine keeps running.
Create a "broker" to distribute messages to connected users:
type Broker struct {
// users is a map where the key is the user id
// and the value is a slice of channels to connections
// for that user id
users map[string][]chan []byte
// actions is a channel of functions to call
// in the broker's goroutine. The broker executes
// everything in that single goroutine to avoid
// data races.
actions chan func()
}
// run executes in a goroutine. It simply gets and
// calls functions.
func (b *Broker) run() {
for a := range b.actions {
a()
}
}
func newBroker() *Broker {
b := &Broker{
users: make(map[string][]chan []byte),
actions: make(chan func()),
}
go b.run()
return b
}
// addUserChan adds a channel for user with given id.
func (b *Broker) addUserChan(id string, ch chan []byte) {
b.actions <- func() {
b.users[id] = append(b.users[id], ch)
}
}
// removeUserChan removes a channel for the user with the given id.
func (b *Broker) removeUserChan(id string, ch chan []byte) {
// The broker may be trying to send to
// ch, but nothing is receiving. Pump ch
// to prevent broker from getting stuck.
go func() { for range ch {} }()
b.actions <- func() {
chs := b.users[id]
i := 0
for _, c := range chs {
if c != ch {
chs[i] = c
i = i + 1
}
}
if i == 0 {
delete(b.users, id)
} else {
b.users[id] = chs[:i]
}
// Close the channel to end the drain loop started
// at the beginning of removeUserChan.
// This must be done in the broker goroutine
// to ensure that the broker does not send to
// a closed channel.
close(ch)
}
}
// sendToUser sends a message to all channels for the given user id.
func (b *Broker) sendToUser(id string, data []byte) {
b.actions <- func() {
for _, ch := range b.users[id] {
ch <- data
}
}
}
Declare a variable with the broker at package-level:
var broker = newBroker()
Write the SSE endpoint using the broker:
func sseEndpoint(w http.ResponseWriter, r *http.Request) {
// I assume that user id is in query string for this example,
// You should use your authentication code to get the id.
id := r.FormValue("id")
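// Do the usual SSE setup. Bail out if the ResponseWriter
// cannot flush, instead of panicking on the type assertion.
flusher, ok := w.(http.Flusher)
if !ok {
http.Error(w, "streaming unsupported", http.StatusInternalServerError)
return
}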
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
// Create channel to receive messages for this connection.
// Register that channel with the broker.
// On return from the function, remove the channel
// from the broker.
ch := make(chan []byte)
broker.addUserChan(id, ch)
defer broker.removeUserChan(id, ch)
for {
select {
case <-r.Context().Done():
// User closed the connection. We are out of here.
return
case m := <-ch:
// We got a message. Do the usual SSE stuff.
fmt.Fprintf(w, "data: %s\n\n", m)
flusher.Flush()
}
}
}
Add code to your application to call Broker.sendToUser.
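To illustrate the last step, here is a reduced, self-contained sketch of the same single-goroutine broker idea, with a call that sends to one user only. `miniBroker`, `add`, `send` and `deliver` are hypothetical names for this demo, and the per-user channels are buffered only so that the demo needs no HTTP handler reading from them.

```go
package main

import "fmt"

// miniBroker is a reduced, hypothetical version of the Broker above.
// All access to users happens in one goroutine, so no lock is needed.
type miniBroker struct {
	users   map[string][]chan []byte
	actions chan func()
}

func newMiniBroker() *miniBroker {
	b := &miniBroker{
		users:   make(map[string][]chan []byte),
		actions: make(chan func()),
	}
	go func() {
		for a := range b.actions {
			a()
		}
	}()
	return b
}

// add registers a channel for the given user id.
func (b *miniBroker) add(id string, ch chan []byte) {
	b.actions <- func() { b.users[id] = append(b.users[id], ch) }
}

// send delivers data to every channel registered for id.
func (b *miniBroker) send(id string, data []byte) {
	b.actions <- func() {
		for _, ch := range b.users[id] {
			ch <- data
		}
	}
}

// deliver registers one channel per id, sends msg to target only,
// and reports which ids actually received something.
func deliver(target, msg string, ids ...string) map[string]string {
	b := newMiniBroker()
	chans := make(map[string]chan []byte)
	for _, id := range ids {
		ch := make(chan []byte, 1) // buffered: the demo has no reader goroutine
		chans[id] = ch
		b.add(id, ch)
	}
	b.send(target, []byte(msg))

	// Actions run in order, so this one runs after the send has finished.
	done := make(chan struct{})
	b.actions <- func() { close(done) }
	<-done

	got := make(map[string]string)
	for id, ch := range chans {
		select {
		case m := <-ch:
			got[id] = string(m)
		default: // nothing was sent to this user
		}
	}
	return got
}

func main() {
	fmt.Println(deliver("alice", "hello", "alice", "bob"))
}
```

In a real application the call would be something like `broker.sendToUser(userID, payload)` from whatever code produces the event.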

How to handle multiple goroutines that share the same channel

I've been searching a lot but have not found an answer to my problem yet.
I need to make multiple concurrent calls to an external API, each with different parameters.
For each call I then need to initialize a struct for its dataset and process the data I receive from the API call. Bear in mind that I read each line of the incoming request and immediately send it to the channel.
The first problem I encountered, which was not obvious at the beginning due to the large quantity of data I'm receiving, is that each goroutine does not receive all the data that goes through the channel (which I learned through research). So what I need is a way of requeuing/redirecting that data to the correct goroutine.
The function that sends the streamed response from a single dataset:
(I've cut parts of the code that are irrelevant here.)
func (api *API) RequestData(ctx context.Context, c chan DWeatherResponse, dataset string, wg *sync.WaitGroup) error {
    // Done must be called somewhere, or wg.Wait() in the caller never returns.
    defer wg.Done()
    // reader wraps the streamed API response (setup omitted).
    for {
        line, err := reader.ReadBytes('\n')
        s := string(line)
        if err != nil {
            log.Printf("End of %s", dataset)
            return err
        }
        data, err := extractDataFromStreamLine(s, dataset)
        if err != nil {
            continue
        }
        c <- *data
    }
}
The function that will process the incoming data
func (s *StrikeStruct) Process(ch, requeue chan dweather.DWeatherResponse) {
    for {
        data, more := <-ch
        if !more {
            break
        }
        // data contains {dataset string, value float64, date time.Time}
        // s.Parameter needs to match the dataset.
        // IMPORTANT PART: check whether the received data belongs to this struct's dataset.
        // If not, I want to send it to another goroutine until it reaches the correct one.
        // There will be at most 4 datasets, but this still may not be the best approach.
        if !api.GetDataset(s.Parameter, data.Dataset) {
            requeue <- data
            continue
        }
        // Do stuff with the data from this point
    }
}
Now on my own API endpoint I have the following:
ch := make(chan dweather.DWeatherResponse, 2)
requeue := make(chan dweather.DWeatherResponse)
final := make(chan strike.StrikePerYearResponse)
var wg sync.WaitGroup
for _, s := range args.Parameters.Strikes {
strike := strike.StrikePerYear{
Parameter: strike.Parameter(s.Dataset),
StrikeValue: s.Value,
}
// I receive and process the data in here
go strike.ProcessStrikePerYear(ch, requeue, final, string(s.Dataset))
}
go func() {
for {
data, _ := <-requeue
ch <- data
}
}()
// Creates a goroutine for each dataset
for _, dataset := range api.Params.Dataset {
wg.Add(1)
go api.RequestData(ctx, ch, dataset, &wg)
}
wg.Wait()
close(ch)
//Once the data is all processed it is all appended
var strikes []strike.StrikePerYearResponse
for range args.Fetch.Datasets {
strikes = append(strikes, <-final)
}
return strikes
The issue with this code is that as soon as I start receiving data from more than one endpoint, the requeue blocks and nothing more happens. If I remove the requeue logic, data is lost whenever it does not land on the correct goroutine.
My two questions are:
Why is the requeue blocking if it has a goroutine always ready to receive?
Should I take a different approach on how I'm processing the incoming data?
This is not a good way to solve your problem; you should change your approach. I suggest an implementation like the one below:
package main
import (
"fmt"
"sync"
)
// answer for https://stackoverflow.com/questions/68454226/how-to-handle-multiple-goroutines-that-share-the-same-channel
var (
finalResult = make(chan string)
)
// IData use for message dispatcher that all struct must implement its method
type IData interface {
IsThisForMe() bool
Process(*sync.WaitGroup)
}
//MainData can be your main struct like StrikePerYear
type MainData struct {
// add any props
Id int
Name string
}
type DataTyp1 struct {
MainData *MainData
}
func (d DataTyp1) IsThisForMe() bool {
// you can check your condition here to checking incoming data
if d.MainData.Id == 2 {
return true
}
return false
}
func (d DataTyp1) Process(wg *sync.WaitGroup) {
d.MainData.Name = "processed by DataTyp1"
// send result to final channel, you can change it as you want
finalResult <- d.MainData.Name
wg.Done()
}
type DataTyp2 struct {
MainData *MainData
}
func (d DataTyp2) IsThisForMe() bool {
// you can check your condition here to checking incoming data
if d.MainData.Id == 3 {
return true
}
return false
}
func (d DataTyp2) Process(wg *sync.WaitGroup) {
d.MainData.Name = "processed by DataTyp2"
// send result to final channel, you can change it as you want
finalResult <- d.MainData.Name
wg.Done()
}
// dispatcher runs a new goroutine for each request.
// You can implement a worker pool to avoid running too many goroutines.
func dispatcher(incomingData *MainData, wg *sync.WaitGroup) {
// depending on your requirements, you can remove this goroutine or keep it
go func() {
var p IData
p = DataTyp1{incomingData}
if p.IsThisForMe() {
go p.Process(wg)
return
}
p = DataTyp2{incomingData}
if p.IsThisForMe() {
go p.Process(wg)
return
}
}()
}
func main() {
dummyDataArray := []MainData{
MainData{Id: 2, Name: "this data #2"},
MainData{Id: 3, Name: "this data #3"},
}
wg := sync.WaitGroup{}
for i := range dummyDataArray {
wg.Add(1)
dispatcher(&dummyDataArray[i], &wg)
}
result := make([]string, 0)
done := make(chan struct{})
// data collector
go func() {
loop:for {
select {
case <-done:
break loop
case r := <-finalResult:
result = append(result, r)
}
}
}()
wg.Wait()
done<- struct{}{}
for _, s := range result {
fmt.Println(s)
}
}
Note: this is only meant to give you ideas for a better solution; it is certainly not production-ready code.
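On the first question, a plausible reason for the stall (not verified against the full code): the workers block while sending to the unbuffered `requeue`, and the requeue goroutine blocks while sending back into a full `ch`, so everyone ends up waiting on someone else. One way to avoid requeuing altogether is to route each item to a channel owned by its dataset. The sketch below is only an illustration of that idea; `reading`, `route` and `sumByDataset` are hypothetical names, with `reading` standing in for `DWeatherResponse`.

```go
package main

import (
	"fmt"
	"sync"
)

// reading is a hypothetical stand-in for DWeatherResponse.
type reading struct {
	Dataset string
	Value   float64
}

// route forwards each reading to the channel registered for its dataset.
// Readings for unknown datasets are dropped. All outputs are closed once in is closed.
func route(in <-chan reading, outs map[string]chan reading) {
	for r := range in {
		if out, ok := outs[r.Dataset]; ok {
			out <- r
		}
	}
	for _, out := range outs {
		close(out)
	}
}

// sumByDataset runs one consumer per dataset and returns the per-dataset sums.
func sumByDataset(datasets []string, data []reading) map[string]float64 {
	in := make(chan reading)
	outs := make(map[string]chan reading)
	sums := make(map[string]float64)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for _, d := range datasets {
		out := make(chan reading)
		outs[d] = out
		wg.Add(1)
		go func(name string, ch <-chan reading) {
			defer wg.Done()
			total := 0.0
			for r := range ch { // this goroutine only ever sees its own dataset
				total += r.Value
			}
			mu.Lock()
			sums[name] = total
			mu.Unlock()
		}(d, out)
	}

	go route(in, outs)
	for _, r := range data {
		in <- r
	}
	close(in)
	wg.Wait()
	return sums
}

func main() {
	sums := sumByDataset(
		[]string{"rain", "wind"},
		[]reading{{"rain", 1}, {"wind", 2}, {"rain", 3}},
	)
	fmt.Println(sums["rain"], sums["wind"])
}
```

Because every consumer has its own channel, no item can land on the wrong goroutine and nothing needs to be sent back.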

Nested errgroup inside bunch of goroutines

I am fairly new to Go and its concurrency principles. My use case involves performing multiple HTTP requests (for a single entity) on a batch of entities. If any HTTP request fails for an entity, I need to stop all parallel HTTP requests for it. I also have to keep a count of the entities that failed with errors. I am trying to use an errgroup inside each entity's goroutine, so that if any HTTP request fails for a single entity, the errgroup terminates and returns the error to its parent goroutine. But I am not sure how to maintain the count of errors.
func processEntities(entity []string) { // main cannot take parameters, so this is a regular function
    errorC := make(chan string) // channel that receives failed entity IDs
    var wg sync.WaitGroup
    for _, link := range entity {
        wg.Add(1)
        // Spawn errorgroup here. errorgroup_spawn
    }
    go func() {
        wg.Wait()
        close(errorC)
    }()
    for msg := range errorC {
        // store the failed entity IDs somewhere
    }
}
and the errgroup part looks like this:
func errorgroup_spawn(ctx context.Context, errorC chan string, wg *sync.WaitGroup) error { // and other params
    defer wg.Done()
    goRoutineCollection, ctxx := errgroup.WithContext(ctx)
    results := make(chan *result)
    goRoutineCollection.Go(func() error {
        // HTTP calls for a single entity; they should use ctxx
        // so that they stop once a sibling call fails.
        // If an error occurs, push it to errorC and return the error.
        _ = ctxx
        return nil
    })
    go func() {
        goRoutineCollection.Wait()
        close(results)
    }()
    // Wait returns the first non-nil error from the group, if any.
    return goRoutineCollection.Wait()
}
PS: I was also thinking of using nested errgroups, but I can't see how to maintain error counts while other errgroups are running.
Can anyone tell me whether this is a correct approach for such real-world scenarios?
One way to keep track of errors is to use a status struct to keep track of which error came from where:
type Status struct {
Entity string
Err error
}
...
errorC := make(chan Status)
// Spawn error groups with the name of the entity, and when an error happens, push Status{Entity:entityName,Err:err} to the channel
You can then read all errors from the error channel and figure out what failed why.
Another option is not to use errorgroups at all. This makes things more explicit, but whether it is better or not is debatable:
// Keep entity statuses
statuses:=make([]Status,len(entity))
for i, link := range entity {
statuses[i].Entity=link
wg.Add(1)
go func(i int) {
defer wg.Done()
ctx, cancel:=context.WithCancel(context.Background())
defer cancel()
// Error collector
status:=make(chan error)
defer close(status)
go func() {
for st:=range status {
if st!=nil {
cancel() // Stop all calls
// store first error
if statuses[i].Err==nil {
statuses[i].Err=st
}
}
}
}()
innerWg:=sync.WaitGroup{}
innerWg.Add(1)
go func() {
defer innerWg.Done()
status<- makeHttpCall(ctx)
}()
innerWg.Add(1)
go func() {
defer innerWg.Done()
status<- makeHttpCall(ctx)
}()
...
innerWg.Wait()
}(i)
}
When everything is done, statuses will contain all entities and corresponding statuses.

Check if someone has read from go channel

How can we set something like a listener on a Go channel, so that we are notified when someone has read something from the channel?
Imagine we have a sequence number for channel entries and we want to decrement it when someone has read a value from our channel somewhere outside our package.
Unbuffered channels hand off data synchronously, so you already know when the data has been read. Buffered channels behave similarly when the buffer is full, but otherwise a send does not block, so this approach wouldn't tell you quite the same thing. Depending on what your needs really are, consider also using tools like sync.WaitGroup.
ch := make(chan Data)
⋮
for {
⋮
// make data available
ch <- data
// now you know it was read
sequenceNumber--
⋮
}
You could create a channel relay mechanism, to capture read events in realtime.
So for example:
func relayer(in <-chan MyStruct) <-chan MyStruct {
out := make(chan MyStruct) // non-buffered chan (see below)
go func() {
defer close(out)
readCountLimit := 10
for item := range in {
out <- item
// ^^^^ so this will block until some worker has read from 'out'
readCountLimit--
}
}()
return out
}
Usage:
type MyStruct struct {
// put your data fields here
}
ch := make(chan MyStruct) // <- original channel - used by producer to write to
rch := relayer(ch) // <- relay channel - used to read from
// consumers
go worker("worker 1", rch)
go worker("worker 2", rch)
// producer
for { ch <- MyStruct{} }
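A runnable variant of the relay idea, as a rough sketch: the relay increments a counter each time a consumer has actually taken a value. `countingRelay` and `consumeAll` are names invented for this example, and `int` is used instead of `MyStruct` to keep it short.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countingRelay forwards values from in to the returned channel and increments
// *reads each time a consumer has taken a value from the relay.
func countingRelay(in <-chan int, reads *int64) <-chan int {
	out := make(chan int) // unbuffered: a send completes only when a consumer receives
	go func() {
		defer close(out)
		for v := range in {
			out <- v
			atomic.AddInt64(reads, 1) // reached only after the value was read
		}
	}()
	return out
}

// consumeAll sends n values through the relay to two consumers
// and returns how many reads the relay observed.
func consumeAll(n int) int64 {
	var reads int64
	in := make(chan int)
	out := countingRelay(in, &reads)

	var wg sync.WaitGroup
	for w := 0; w < 2; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range out {
				// a real worker would process the value here
			}
		}()
	}

	for i := 0; i < n; i++ {
		in <- i
	}
	close(in)
	wg.Wait() // out is closed only after the relay has counted every read
	return atomic.LoadInt64(&reads)
}

func main() {
	fmt.Println(consumeAll(5))
}
```

This only works as described because `out` is unbuffered; with a buffered relay channel the counter would track values queued, not values read.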
You can also do it manually by implementing some sort of ACK marker on the message.
Something like this:
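import (
    "sync/atomic"
    "time"
)
type Msg struct {
    Data int
    ack  atomic.Bool // set by the reader, polled by the sender (Go 1.19+)
}
func (m *Msg) Ack() {
    m.ack.Store(true)
}
func (m *Msg) Acked() bool {
    return m.ack.Load()
}
func main() {
    ch := make(chan *Msg)
    msg := &Msg{Data: 1}
    // Consumer: acknowledges every message it reads.
    go func() {
        for m := range ch {
            m.Ack()
        }
    }()
    ch <- msg
    close(ch)
    // Poll until the consumer has acknowledged the message.
    for !msg.Acked() {
        time.Sleep(10 * time.Millisecond)
    }
    // do smth
}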
Code not tested.
You can also add some additional information to Ack() method, say meta information about package and func, from where Ack() was called, this answer may be related: https://stackoverflow.com/a/35213181/3782382

Go: one channel with multiple listeners

I'm pretty new to Go, so sorry if the topic is wrong, but I hope you understand my question. I want to deliver events to different goroutines via a channel. Here is some sample code:
type Event struct {
Host string
Command string
Output string
}
var (
incoming = make(chan Event)
)
func processEmail(ticker *time.Ticker) {
for {
select {
case t := <-ticker.C:
fmt.Println("Email Tick at", t)
case e := <-incoming:
fmt.Println("EMAIL GOT AN EVENT!")
fmt.Println(e)
}
}
}
func processPagerDuty(ticker *time.Ticker) {
for {
select {
case t := <-ticker.C:
fmt.Println("Pagerduty Tick at", t)
case e := <-incoming:
fmt.Println("PAGERDUTY GOT AN EVENT!")
fmt.Println(e)
}
}
}
func main() {
err := gcfg.ReadFileInto(&cfg, "dispatch-api.cfg")
if err != nil {
fmt.Printf("Error loading the config")
}
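emailTicker := time.NewTicker(time.Second * 10)
go processEmail(emailTicker)
// A second := on the same name would not compile, so each ticker gets its own variable.
pagerDutyTicker := time.NewTicker(time.Second * 1)
go processPagerDuty(pagerDutyTicker)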
}
func eventAdd(r render.Render, params martini.Params, req *http.Request) {
// create an event now
e := Event{Host: "web01-east.domain.com", Command: "foo", Output: "bar"}
incoming <- e
}
So the ticker events work just great. When I issue an API call to create an event, I only get output from the processEmail function. Whichever goroutine gets scheduled first receives the event from the channel.
Is there a way for both functions to get that event?
You can use fan-in and fan-out (from Rob Pike's talk):
package main
func main() {
// feeders - feeder1, feeder2 and feeder3 are used to fan in
// data into one channel
go func() {
for {
select {
case v1 := <-feeder1:
mainChannel <- v1
case v2 := <-feeder2:
mainChannel <- v2
case v3 := <-feeder3:
mainChannel <- v3
}
}
}()
// dispatchers - not actually fan out rather dispatching data
go func() {
for {
v := <-mainChannel
// use this to prevent leaking goroutines
// (i.e. when one consumer got stuck)
done := make(chan bool)
go func() {
consumer1 <- v
done <- true
}()
go func() {
consumer2 <- v
done <- true
}()
go func() {
consumer3 <- v
done <- true
}()
<-done
<-done
<-done
}
}()
// or fan out (when processing the data by just one consumer is enough)
go func() {
for {
v := <-mainChannel
select {
case consumer1 <- v:
case consumer2 <- v:
case consumer3 <- v:
}
}
}()
// consumers(your logic)
go func() { <-consumer1 /* using the value */ }()
go func() { <-consumer2 /* using the value */ }()
go func() { <-consumer3 /* using the value */ }()
}
type payload int
var (
feeder1 = make(chan payload)
feeder2 = make(chan payload)
feeder3 = make(chan payload)
mainChannel = make(chan payload)
consumer1 = make(chan payload)
consumer2 = make(chan payload)
consumer3 = make(chan payload)
)
Channels are a point to point communication method, not a broadcast communication method, so no, you can't get both functions to get the event without doing something special.
You could have separate channels for both goroutines and send the message into each. This is probably the simplest solution.
Or alternatively you could get one goroutine to signal the next one.
Go has two mechanisms for doing broadcast signalling as far as I know. One is closing a channel. This only works a single time though.
The other is to use a sync.Cond lock. These are moderately tricky to use, but will allow you to have multiple goroutines woken up by a single event.
If I were you, I'd go for the first option and send the event to two different channels. That seems to map to the problem quite well.
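A small runnable sketch of that first option, assuming one channel per listener and a single goroutine that copies each event to all of them. `broadcast` and `deliverToAll` are hypothetical names; `Event` mirrors the struct from the question.

```go
package main

import (
	"fmt"
	"sync"
)

// Event mirrors the struct from the question.
type Event struct {
	Host, Command, Output string
}

// broadcast copies every event from in to each listener channel and
// closes the listeners once in is closed.
func broadcast(in <-chan Event, listeners ...chan<- Event) {
	for e := range in {
		for _, l := range listeners {
			l <- e // blocks until this listener has taken the event
		}
	}
	for _, l := range listeners {
		close(l)
	}
}

// deliverToAll sends events to n listeners and returns how many events each one saw.
func deliverToAll(events []Event, n int) []int {
	in := make(chan Event)
	counts := make([]int, n)
	chans := make([]chan<- Event, n)
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		ch := make(chan Event)
		chans[i] = ch
		wg.Add(1)
		go func(i int, ch <-chan Event) {
			defer wg.Done()
			for range ch {
				counts[i]++ // each goroutine writes only its own slot
			}
		}(i, ch)
	}

	go broadcast(in, chans...)
	for _, e := range events {
		in <- e
	}
	close(in)
	wg.Wait()
	return counts
}

func main() {
	events := []Event{{Host: "web01-east.domain.com", Command: "foo", Output: "bar"}}
	fmt.Println(deliverToAll(events, 2))
}
```

One trade-off of this design: a slow listener delays delivery to the others, so buffered listener channels or a per-listener goroutine may be needed if that matters.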
