Why does this OrDone Channel Implementation receive twice from Done Channel? - go

Been reading through Concurrency in Go and it introduces a handy "or-done" channel.
TLDR; when you're working with a channel that you're not in control of (presumably from some other part of your system), the code can get a little ugly.
// Quite nice to read
for v := range myChan {
    // ... do stuff with v
}

// Not so nice
loop:
for {
    select {
    case <-done:
        break loop
    case maybeVal, ok := <-myChan:
        if !ok {
            return
        }
        // Do something with maybeVal
    }
}
The book offers a way to simplify this with the or-done channel, defined as follows. What I don't understand is why, in the nested select, we need to receive from <-done again.
orDone := func(done, c <-chan interface{}) <-chan interface{} {
    valStream := make(chan interface{})
    go func() {
        defer close(valStream)
        for {
            select {
            case <-done:
                return
            case v, ok := <-c:
                if !ok {
                    return
                }
                select {
                case valStream <- v:
                case <-done: // Why do we also need to receive on done here?
                }
            }
        }
    }()
    return valStream
}
This allows you to go back to your original for loop, enhancing readability - like so:
for val := range orDone(done, myChan) {
    // Once again, do something with val
}

Really just adding visibility to Peter's answer.
It's because the send on valStream itself may block if whoever is receiving from valStream loses interest.

In a situation where you have received a value from channel c, you enter the nested select. At this point, if we took out the second <-done, you would have this:
select {
case valStream <- v:
}
This will block indefinitely until someone receives the value from valStream, even if the done channel is closed. By adding the nested check, we allow ourselves to exit the select at either point.
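For illustration, a minimal sketch (assuming the orDone closure above is in scope) of a consumer that takes one value and then loses interest; without the inner <-done case, the goroutine inside orDone would stay blocked on valStream <- v forever once done is closed:

done := make(chan interface{})
myChan := make(chan interface{})

go func() {
    for i := 0; i < 10; i++ {
        myChan <- i // upstream keeps producing
    }
}()

vals := orDone(done, myChan)
fmt.Println(<-vals) // consume a single value...
close(done)         // ...then lose interest; orDone can still exit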

An additional question: why is the body of that "case <-done:" empty?
orDone := func(done, c <-chan interface{}) <-chan interface{} {
    valStream := make(chan interface{})
    go func() {
        defer close(valStream)
        for {
            select {
            case <-done:
                return
            case v, ok := <-c:
                if !ok {
                    return
                }
                select {
                case valStream <- v:
                case <-done:
                    return // I think there should be a "return" here
                }
            }
        }
    }()
    return valStream
}

Related

How to use multiple expressions in single case statement?

In the below code:
package main

import (
    "fmt"
    "reflect"
)

type Model1 struct {
    ID string
}

type Model2 struct {
    ID string
}

func main() {
    ch1 := make(chan Model1)
    close(ch1)
    checkIfChannelClosed(ch1)
    ch2 := make(chan Model2)
    close(ch2)
    checkIfChannelClosed(ch2)
}

func checkIfChannelClosed(ch interface{}) bool {
    if reflect.TypeOf(ch).Kind() != reflect.Chan {
        fmt.Println("only channels can be closed")
        return false
    }
    ok := true
    if ch == nil {
        return false
    }
    switch v := ch.(type) {
    case chan Model1:
        select {
        case _, ok = <-v: // Line 26
        default:
        }
    case chan Model2:
        select {
        case _, ok = <-v:
        default:
        }
    default:
        fmt.Println("Invalid case")
    }
    if ok {
        fmt.Println("channel is open")
    } else {
        fmt.Println("channel is closed")
    }
    return ok
}
The Go compiler rejects the code below, where multiple types are listed in a single case (the receive from v no longer compiles). The goal is to avoid the redundant select code:
switch v := ch.(type) {
case chan Model1, chan Model2:
    select {
    case _, ok = <-v:
    default:
    }
default:
    fmt.Println("Invalid case")
}
How to use multiple expressions with case statement?
I read this in "The Go Programming Language", chapter 7.13:
In this style, the emphasis is on the concrete types that satisfy the interface, not on the interface's methods (if indeed it has any), and there is no hiding of information.
So a type switch with a single type in a case gives v that concrete type; but if you list multiple types in one case of switch v := ch.(type), what happens in the following code?
switch v := ch.(type) {
case chan Model1, int:
    // do something
}
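What happens is that v keeps the static type of ch, interface{} here, because the case lists more than one type, so it cannot be received from directly. A minimal sketch (the isClosed helper is hypothetical, not from the question) of working around that with explicit assertions inside the case:

// Sketch only: with several types in one case, v's static type stays
// interface{}, so it has to be asserted again before it can be used
// in a receive.
func isClosed(ch interface{}) bool {
    switch v := ch.(type) {
    case chan Model1, chan Model2:
        fmt.Printf("static type is interface{}, dynamic type is %T\n", v)
        if c, ok := v.(chan Model1); ok {
            select {
            case _, open := <-c:
                return !open
            default:
            }
        }
        if c, ok := v.(chan Model2); ok {
            select {
            case _, open := <-c:
                return !open
            default:
            }
        }
    }
    return false
}

Note the duplication comes straight back, so this does not really solve the original problem.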
Just use reflect.Value to do this:
func checkIfChannelClosed(ch interface{}) bool {
    v := reflect.ValueOf(ch)
    if v.Kind() != reflect.Chan {
        fmt.Println("only channels can be closed")
        return false
    }
    _, ok := v.TryRecv()
    if ok {
        fmt.Println("recv value from channel..")
    } else {
        fmt.Println("channel is closed or receive cannot finish without blocking")
    }
    return ok
}
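A quick usage sketch with the Model1 type from the question; note that an open-but-empty channel is reported the same way as a closed one, since TryRecv cannot be allowed to block:

closedCh := make(chan Model1)
close(closedCh)
checkIfChannelClosed(closedCh) // prints the "closed or cannot finish without blocking" message

emptyCh := make(chan Model1)
checkIfChannelClosed(emptyCh) // open but empty: prints the same message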

How to handle multiple goroutines that share the same channel

I've been searching a lot but could not find an answer for my problem yet.
I need to make multiple calls to an external API, but with different parameters, concurrently.
Then for each call I need to init a struct for each dataset and process the data I receive from the API call. Bear in mind that I read each line of the incoming response and immediately start sending it to the channel.
The first problem I encountered was not obvious at the beginning, due to the large quantity of data I'm receiving: each goroutine does not receive all the data that goes through the channel (which I learned from the research I've done). So what I need is a way of requeuing/redirecting that data to the correct goroutine.
The function that sends the streamed response from a single dataset.
(I've cut useless parts of code that are out of context)
func (api *API) RequestData(ctx context.Context, c chan DWeatherResponse, dataset string, wg *sync.WaitGroup) error {
    for {
        line, err := reader.ReadBytes('\n')
        s := string(line)
        if err != nil {
            log.Printf("End of %s", dataset)
            return err
        }
        data, err := extractDataFromStreamLine(s, dataset)
        if err != nil {
            continue
        }
        c <- *data
    }
}
The function that will process the incoming data
func (s *StrikeStruct) Process(ch, requeue chan dweather.DWeatherResponse) {
    for {
        data, more := <-ch
        if !more {
            break
        }
        // data contains {dataset string, value float64, date time.Time}
        // The s.Parameter needs to match the dataset.
        // IMPORTANT PART: checks if the received data belongs to this struct's dataset.
        // If not, I want to send it to another goroutine until it gets to the correct
        // one. There will be a max of 4 datasets, but this still may not be the best approach.
        if !api.GetDataset(s.Parameter, data.Dataset) {
            requeue <- data
            continue
        }
        // Do stuff with the data from this point
    }
}
Now on my own API endpoint I have the following:
ch := make(chan dweather.DWeatherResponse, 2)
requeue := make(chan dweather.DWeatherResponse)
final := make(chan strike.StrikePerYearResponse)
var wg sync.WaitGroup

for _, s := range args.Parameters.Strikes {
    strike := strike.StrikePerYear{
        Parameter:   strike.Parameter(s.Dataset),
        StrikeValue: s.Value,
    }
    // I receive and process the data in here
    go strike.ProcessStrikePerYear(ch, requeue, final, string(s.Dataset))
}

go func() {
    for {
        data := <-requeue
        ch <- data
    }
}()

// Creates a goroutine for each dataset
for _, dataset := range api.Params.Dataset {
    wg.Add(1)
    go api.RequestData(ctx, ch, dataset, &wg)
}

wg.Wait()
close(ch)

// Once the data is all processed it is all appended
var strikes []strike.StrikePerYearResponse
for range args.Fetch.Datasets {
    strikes = append(strikes, <-final)
}
return strikes
The issue with this code is that as soon as I start receiving data from more than one endpoint, the requeue blocks and nothing more happens. If I remove that requeue logic, data is lost if it does not land on the correct goroutine.
My two questions are:
Why is the requeue blocking if it has a goroutine always ready to receive?
Should I take a different approach on how I'm processing the incoming data?
This is not a good way of solving your problem; you should change your approach. I suggest an implementation like the one below:
package main

import (
    "fmt"
    "sync"
)

// answer for https://stackoverflow.com/questions/68454226/how-to-handle-multiple-goroutines-that-share-the-same-channel

var (
    finalResult = make(chan string)
)

// IData is used by the message dispatcher; every struct must implement its methods.
type IData interface {
    IsThisForMe() bool
    Process(*sync.WaitGroup)
}

// MainData can be your main struct, like StrikePerYear.
type MainData struct {
    // add any props
    Id   int
    Name string
}

type DataTyp1 struct {
    MainData *MainData
}

func (d DataTyp1) IsThisForMe() bool {
    // you can check your condition on the incoming data here
    if d.MainData.Id == 2 {
        return true
    }
    return false
}

func (d DataTyp1) Process(wg *sync.WaitGroup) {
    d.MainData.Name = "processed by DataTyp1"
    // send the result to the final channel; change this as you want
    finalResult <- d.MainData.Name
    wg.Done()
}

type DataTyp2 struct {
    MainData *MainData
}

func (d DataTyp2) IsThisForMe() bool {
    // you can check your condition on the incoming data here
    if d.MainData.Id == 3 {
        return true
    }
    return false
}

func (d DataTyp2) Process(wg *sync.WaitGroup) {
    d.MainData.Name = "processed by DataTyp2"
    // send the result to the final channel; change this as you want
    finalResult <- d.MainData.Name
    wg.Done()
}

// dispatcher runs a new goroutine for each request.
// You can implement a worker pool to prevent running too many goroutines.
func dispatcher(incomingData *MainData, wg *sync.WaitGroup) {
    // based on your requirements you can remove this goroutine or not
    go func() {
        var p IData
        p = DataTyp1{incomingData}
        if p.IsThisForMe() {
            go p.Process(wg)
            return
        }
        p = DataTyp2{incomingData}
        if p.IsThisForMe() {
            go p.Process(wg)
            return
        }
    }()
}

func main() {
    dummyDataArray := []MainData{
        {Id: 2, Name: "this data #2"},
        {Id: 3, Name: "this data #3"},
    }
    wg := sync.WaitGroup{}
    for i := range dummyDataArray {
        wg.Add(1)
        dispatcher(&dummyDataArray[i], &wg)
    }
    result := make([]string, 0)
    done := make(chan struct{})
    // data collector
    go func() {
    loop:
        for {
            select {
            case <-done:
                break loop
            case r := <-finalResult:
                result = append(result, r)
            }
        }
    }()
    wg.Wait()
    done <- struct{}{}
    for _, s := range result {
        fmt.Println(s)
    }
}
Note: this is just meant to open your mind to finding a better solution; it is certainly not production-ready code.

Watch for changes in a queue containing struct

I have two goroutines:
first one adds task to queue
second cleans up from the queue based on status
Add and cleanup might not be simultaneous.
If the status of a task is success, I want to delete the task from the queue; if not, I will retry until the status is success (with a time limit). If that fails, I will log and delete it from the queue.
We can't communicate between add and delete because that is not how the real world scenario works.
I want something like a watcher which monitors additions to the queue and does the following cleanup. To increase complexity, Add might be adding even while cleanup is happening (not shown here). I want to implement it without using external packages.
How can I achieve this?
type Task struct {
    name   string
    status string // completed, failed
}

var list []*Task

func main() {
    done := make(chan bool)
    go Add()
    time.Sleep(15 * time.Second)
    go clean(done)
    <-done
}

func Add() {
    t1 := &Task{"test1", "completed"}
    t2 := &Task{"test2", "failed"}
    list = append(list, t1, t2)
}

func clean(done chan bool) {
    for k, v := range list {
        if v.status == "completed" {
            list = RemoveIndex(list, k)
        } else {
            // for now consider this as the retry
            v.status = "completed"
        }
    }
    if len(list) > 0 {
        clean(done)
    }
    done <- true
}

func RemoveIndex(s []*Task, index int) []*Task {
    return append(s[:index], s[index+1:]...)
}
So I found a solution which works for me and am posting it here for anyone it might be helpful to.
In my main I have added a ticker which runs every x seconds to watch whether something has been added to the queue.
type Task struct {
    name   string
    status string // completed, failed
}

var list []*Task

func main() {
    done := make(chan bool)
    c := make(chan os.Signal, 2)
    ticker := time.NewTicker(5 * time.Second) // the interval is arbitrary here
    go Add()
    go func() {
        for {
            select {
            // case <-done:
            //     cleaner()
            case <-ticker.C:
                Monitor(done)
            }
        }
    }()
    signal.Notify(c, os.Interrupt, syscall.SIGTERM)
    <-c
    // waiting for interrupt here
}

func Add() {
    t1 := &Task{"test1", "completed"}
    t2 := &Task{"test2", "failed"}
    list = append(list, t1, t2)
}

func Monitor(done chan bool) {
    if len(list) > 0 {
        cleaner()
    }
}

func cleaner() {
    // do cleaning here:
    // pop each element from the queue and delete it
}

func RemoveIndex(s []*Task, index int) []*Task {
    return append(s[:index], s[index+1:]...)
}
So now this solution does not need to depend on communication between goroutines.
In a real-world scenario, the program never dies and keeps adding and cleaning based on the use case. You can optimize further by locking and unlocking around additions to and deletions from the queue.
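For example, a minimal sketch of that locking, assuming a package-level sync.Mutex sitting next to the list declared above:

var mu sync.Mutex // guards list

func Add() {
    mu.Lock()
    defer mu.Unlock()
    list = append(list, &Task{"test1", "completed"}, &Task{"test2", "failed"})
}

func cleaner() {
    mu.Lock()
    defer mu.Unlock()
    for len(list) > 0 {
        t := list[0]
        list = list[1:] // pop from the front...
        _ = t           // ...and handle the task here
    }
}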

Stop for loop by passing empty struct down channel Go

I am attempting to create a poller in Go that spins up and every 24 hours executes a function.
I want to also be able to stop the polling, I'm attempting to do this by having a done channel and passing down an empty struct to stop the for loop.
In my tests the for loop just runs infinitely and I can't seem to stop it; am I using the done channel incorrectly? The ticker case works as expected.
type Poller struct {
    HandlerFunc HandlerFunc
    interval    *time.Ticker
    done        chan struct{}
}

func (p *Poller) Start() error {
    for {
        select {
        case <-p.interval.C:
            err := p.HandlerFunc()
            if err != nil {
                return err
            }
        case <-p.done:
            return nil
        }
    }
}

func (p *Poller) Stop() {
    p.done <- struct{}{}
}
Here is the test that's executing the code and causing the infinite loop.
poller := poller.NewPoller(
    testHandlerFunc,
    time.NewTicker(1*time.Millisecond),
)
err := poller.Start()
assert.Error(t, err)
poller.Stop()
It seems the problem is in your usage: you are calling poller.Start() in a blocking manner, so poller.Stop() is never called. It's common in Go projects to start a goroutine inside Start/Run methods, so in poller.Start() I would do something like this:
func (p *Poller) Start() <-chan error {
    errc := make(chan error, 1)
    go func() {
        defer close(errc)
        for {
            select {
            case <-p.interval.C:
                err := p.HandlerFunc()
                if err != nil {
                    errc <- err
                    return
                }
            case <-p.done:
                return
            }
        }
    }()
    return errc
}
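A sketch of how the test could drive this version (NewPoller and testHandlerFunc are the names from the question; it assumes the handler does not fail during the test window):

p := poller.NewPoller(testHandlerFunc, time.NewTicker(1*time.Millisecond))
errc := p.Start()                // returns immediately now
time.Sleep(5 * time.Millisecond) // let it tick a few times
p.Stop()
if err := <-errc; err != nil { // errc is closed once the goroutine exits
    t.Fatal(err)
}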
Also, there's no need to send an empty struct on the done channel. Closing the channel with close(p.done) is more idiomatic in Go.
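With that change, Stop becomes:

func (p *Poller) Stop() {
    close(p.done) // wakes the polling goroutine; only call Stop once
}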
There is no explicit way in Go to broadcast an event to goroutines for something like cancellation. Instead, it's idiomatic to create a channel whose closing signifies a message, such as cancelling any work to be done. Something like this is a viable pattern:
var done = make(chan struct{})

func cancelled() bool {
    select {
    case <-done:
        return true
    default:
        return false
    }
}
Goroutines can call cancelled to poll for cancellation.
Then your main loop can respond to such an event, but make sure you drain any channels that might cause goroutines to block.
for {
    select {
    case <-done:
        // Drain whatever channels you need to.
        for range someChannel {
        }
        return
    // ... other cases
    }
}

Go: one channel with multiple listeners

I'm pretty new to Go, so sorry if the topic is wrong, but I hope you understand my question. I want to dispatch events to different goroutines via a channel. Here is some sample code:
type Event struct {
    Host    string
    Command string
    Output  string
}

var (
    incoming = make(chan Event)
)

func processEmail(ticker *time.Ticker) {
    for {
        select {
        case t := <-ticker.C:
            fmt.Println("Email Tick at", t)
        case e := <-incoming:
            fmt.Println("EMAIL GOT AN EVENT!")
            fmt.Println(e)
        }
    }
}

func processPagerDuty(ticker *time.Ticker) {
    for {
        select {
        case t := <-ticker.C:
            fmt.Println("Pagerduty Tick at", t)
        case e := <-incoming:
            fmt.Println("PAGERDUTY GOT AN EVENT!")
            fmt.Println(e)
        }
    }
}

func main() {
    err := gcfg.ReadFileInto(&cfg, "dispatch-api.cfg")
    if err != nil {
        fmt.Printf("Error loading the config")
    }
    emailTicker := time.NewTicker(time.Second * 10)
    go processEmail(emailTicker)
    pagerTicker := time.NewTicker(time.Second * 1)
    go processPagerDuty(pagerTicker)
}

func eventAdd(r render.Render, params martini.Params, req *http.Request) {
    // create an event now
    e := Event{Host: "web01-east.domain.com", Command: "foo", Output: "bar"}
    incoming <- e
}
So the ticker events work just great. When I issue an API call to create an event, I only get output from the processEmail function. Whichever goroutine happens to receive first gets the event from the channel.
Is there a way for both functions to get that event?
You can use fan-in and fan-out (from Rob Pike's talk):
package main

func main() {
    // feeders - feeder1, feeder2 and feeder3 are used to fan in
    // data into one channel
    go func() {
        for {
            select {
            case v1 := <-feeder1:
                mainChannel <- v1
            case v2 := <-feeder2:
                mainChannel <- v2
            case v3 := <-feeder3:
                mainChannel <- v3
            }
        }
    }()

    // dispatchers - not actually fan-out, rather dispatching data
    go func() {
        for {
            v := <-mainChannel
            // use this to prevent leaking goroutines
            // (i.e. when one consumer gets stuck)
            done := make(chan bool)
            go func() {
                consumer1 <- v
                done <- true
            }()
            go func() {
                consumer2 <- v
                done <- true
            }()
            go func() {
                consumer3 <- v
                done <- true
            }()
            <-done
            <-done
            <-done
        }
    }()

    // or fan-out (when processing the data by just one consumer is enough)
    go func() {
        for {
            v := <-mainChannel
            select {
            case consumer1 <- v:
            case consumer2 <- v:
            case consumer3 <- v:
            }
        }
    }()

    // consumers (your logic)
    go func() { <-consumer1 /* use the value */ }()
    go func() { <-consumer2 /* use the value */ }()
    go func() { <-consumer3 /* use the value */ }()
}

type payload int

var (
    feeder1     = make(chan payload)
    feeder2     = make(chan payload)
    feeder3     = make(chan payload)
    mainChannel = make(chan payload)
    consumer1   = make(chan payload)
    consumer2   = make(chan payload)
    consumer3   = make(chan payload)
)
Channels are a point to point communication method, not a broadcast communication method, so no, you can't get both functions to get the event without doing something special.
You could have separate channels for both goroutines and send the message into each. This is probably the simplest solution.
Or alternatively you could get one goroutine to signal the next one.
Go has two mechanisms for doing broadcast signalling as far as I know. One is closing a channel. This only works a single time though.
The other is to use a sync.Cond lock. These are moderately tricky to use, but will allow you to have multiple goroutines woken up by a single event.
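A rough, self-contained sketch of the sync.Cond approach (all names here are made up for illustration):

package main

import (
    "fmt"
    "sync"
)

func main() {
    var (
        mu    sync.Mutex
        fired bool // the event being broadcast
        wg    sync.WaitGroup
    )
    cond := sync.NewCond(&mu)

    // two listeners woken by the same broadcast
    for i := 0; i < 2; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            mu.Lock()
            for !fired {
                cond.Wait() // releases mu while waiting
            }
            mu.Unlock()
            fmt.Println("listener", id, "woke up")
        }(i)
    }

    // the broadcaster
    mu.Lock()
    fired = true
    mu.Unlock()
    cond.Broadcast() // wakes every waiting goroutine

    wg.Wait()
}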
If I was you, I'd go for the first option, send the event to two different channels. That seems to map the problem quite well.
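A minimal sketch of that first option, using hypothetical emailEvents and pagerEvents channels in place of the single incoming channel:

var (
    emailEvents = make(chan Event)
    pagerEvents = make(chan Event)
)

// broadcastEvent sends the same event to both consumers; each unbuffered
// send blocks until the corresponding goroutine has received it.
func broadcastEvent(e Event) {
    emailEvents <- e
    pagerEvents <- e
}

// processEmail would then receive from emailEvents, and processPagerDuty from pagerEvents.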
