Idiomatic way to make a request-response communication using channels - go

Maybe I'm just not reading the spec right or my mindset is still stuck with older synchronization methods, but what is the right way in Go to send one type as receive something else as a response?
One way I had come up with was
package main
import "fmt"
type request struct {
out chan string
argument int
}
var input = make(chan *request)
var cache = map[int]string{}
func processor() {
for {
select {
case in := <- input:
if result, exists := cache[in.argument]; exists {
in.out <- result
}
result := fmt.Sprintf("%d", in.argument)
cache[in.argument] = result
in.out <- result
}
}
}
func main() {
go processor()
responseCh := make(chan string)
input <- &request{
responseCh,
1,
}
result := <- responseCh
fmt.Println(result)
}
That cache is not really necessary for this example but otherwise it would cause a datarace.
Is this what I'm supposed to do?

There're plenty of possibilities, depends what is best approach for your problem. When you receive something from a channel, there is nothing like a default way for responding – you need to build the flow by yourself (and you definitely did in the example in your question). Sending a response channel with every request gives you a great flexibility as with every request you can choose where to route the response, but quite often is not necessary.
Here are some other examples:
1. Sending and receiving from the same channel
You can use unbuffered channel for both sending and receiving the responses. This nicely illustrates that unbuffered channels are in fact a synchronisation points in your program. The limitation is of course that we need to send exactly the same type as request and response:
package main
import (
"fmt"
)
func pow2() (c chan int) {
c = make(chan int)
go func() {
for x := range c {
c <- x*x
}
}()
return c
}
func main() {
c := pow2()
c <- 2
fmt.Println(<-c) // = 4
c <- 4
fmt.Println(<-c) // = 8
}
2. Sending to one channel, receiving from another
You can separate input and output channels. You would be able to use buffered version if you wish. This can be used as request/response scenario and would allow you to have a route responsible for sending the requests, another one for processing them and yet another for receiving responses. Example:
package main
import (
"fmt"
)
func pow2() (in chan int, out chan int) {
in = make(chan int)
out = make(chan int)
go func() {
for x := range in {
out <- x*x
}
}()
return
}
func main() {
in, out := pow2()
go func() {
in <- 2
in <- 4
}()
fmt.Println(<-out) // = 4
fmt.Println(<-out) // = 8
}
3. Sending response channel with every request
This is what you've presented in the question. Gives you a flexibility of specifying the response route. This is useful if you want the response to hit the specific processing routine, for example you have many clients with some tasks to do and you want the response to be received by the same client.
package main
import (
"fmt"
"sync"
)
type Task struct {
x int
c chan int
}
func pow2(in chan Task) {
for t := range in {
t.c <- t.x*t.x
}
}
func main() {
var wg sync.WaitGroup
in := make(chan Task)
// Two processors
go pow2(in)
go pow2(in)
// Five clients with some tasks
for n := 1; n < 5; n++ {
wg.Add(1)
go func(x int) {
defer wg.Done()
c := make(chan int)
in <- Task{x, c}
fmt.Printf("%d**2 = %d\n", x, <-c)
}(n)
}
wg.Wait()
}
Worth saying this scenario doesn't necessary need to be implemented with per-task return channel. If the result has some sort of the client context (for example client id), a single multiplexer could be receiving all the responses and then processing them according to the context.
Sometimes it doesn't make sense to involve channels to achieve simple request-response pattern. When designing go programs, I caught myself trying to inject too many channels into the system (just because I think they're really great). Old good function calls is sometimes all we need:
package main
import (
"fmt"
)
func pow2(x int) int {
return x*x
}
func main() {
fmt.Println(pow2(2))
fmt.Println(pow2(4))
}
(And this might be a good solution if anyone encounters similar problem as in your example. Echoing the comments you've received under your question, having to protect a single structure, like cache, it might be better to create a structure and expose some methods, which would protect concurrent use with mutex.)

Related

Any concurrency issues when multiple goroutines access different fields of the same struct

Are there are any concurrency issues if we access mutually exclusive fields of a struct inside different go co-routines?
I remember reading somewhere that if two parallel threads access same object they might get run on different cores of the cpu both having different cpu level caches with different copies of the object in question. (Not related to Go)
Will the below code be sufficient to achieve the functionality correctly or does additional synchronization mechanism needs to be used?
package main
import (
"fmt"
"sync"
)
type structure struct {
x string
y string
}
func main() {
val := structure{}
wg := new(sync.WaitGroup)
wg.Add(2)
go func1(&val, wg)
go func2(&val, wg)
wg.Wait()
fmt.Println(val)
}
func func1(val *structure, wg *sync.WaitGroup) {
val.x = "Test 1"
wg.Done()
}
func func2(val *structure, wg *sync.WaitGroup) {
val.y = "Test 2"
wg.Done()
}
Edit: - for people who ask why not channels unfortunately this is not the actual code I am working on. Both the func has calls to different api and get the data in a struct, those struct has a pragma.DoNotCopy in them ask protobuf auto generator why they thought it was a good idea. So those data can't be sent over the channel, or else i have to create another struct to send the data over or ask the linter to stop complaining. Or i can send a pointer to the object but feel that is also sharing the memory.
You must synchronize when at least one of accesses to shared resources is a write.
Your code is doing write access, yes, but different struct fields have different memory locations. So you are not accessing shared variables.
If you run your program with the race detector, eg. go run -race main.go it will not print a warning.
Now add fmt.Println(val.y) in func1 and run again, it will print:
WARNING: DATA RACE
Write at 0x00c0000c0010 by goroutine 8:
... rest of race warning
The preferred way in Go would be to communicate memory as opposed to share the memory.
In practice that would mean you should use the Go channels as I show in this blogpost.
https://marcofranssen.nl/concurrency-in-go
If you really want to stick with sharing memory you will have to use a Mutex.
https://tour.golang.org/concurrency/9
However that would cause context switching and Go routines synchronization that slows down your program.
Example using channels
package main
import (
"fmt"
"time"
)
type structure struct {
x string
y string
}
func main() {
val := structure{}
c := make(chan structure)
go func1(c)
go func2(c)
func(c chan structure) {
for {
select {
case v, ok := <-c:
if !ok {
return
}
if v.x != "" {
fmt.Printf("Received %v\n", v)
val.x = v.x
}
if v.y != "" {
fmt.Printf("Received %v\n", v)
val.y = v.y
}
if val.x != "" && val.y != "" {
close(c)
}
}
}
}(c)
fmt.Printf("%v\n", val)
}
func func1(c chan<- structure) {
time.Sleep(1 * time.Second)
c <- structure{x: "Test 1"}
}
func func2(c chan<- structure) {
c <- structure{y: "Test 2"}
}

How to coordinate shutdown with many goroutines

Say I have a function
type Foo struct {}
func (a *Foo) Bar() {
// some expensive work - does some calls to redis
}
which gets executed within a goroutine at some point in my app. Lots of these may be executing at any given point. Prior to application termination, I would like to ensure all remaining goroutines have finished their work.
Can I do something like this:
type Foo struct {
wg sync.WaitGroup
}
func (a *Foo) Close() {
a.wg.Wait()
}
func (a *Foo) Bar() {
a.wg.Add(1)
defer a.wg.Done()
// some expensive work - does some calls to redis
}
Assuming here that Bar gets executed within a goroutine and many of these may be running at a given time and that Bar should not be called once Close is called and Close is called upon a sigterm or sigint.
Does this make sense?
Usually I would see the Bar function look like this:
func (a *Foo) Bar() {
a.wg.Add(1)
go func() {
defer a.wg.Done()
// some expensive work - does some calls to redis
}()
}
Yes, WaitGroup is the right answer. You can use WaitGroup.Add at anytime that the counter is greater than zero, as per doc.
Note that calls with a positive delta that occur when the counter is zero must happen before a Wait. Calls with a negative delta, or calls with a positive delta that start when the counter is greater than zero, may happen at any time. Typically this means the calls to Add should execute before the statement creating the goroutine or other event to be waited for. If a WaitGroup is reused to wait for several independent sets of events, new Add calls must happen after all previous Wait calls have returned. See the WaitGroup example.
But one trick is that, you should always keep the counter greater than zero, before Close is called. That usually means you should call wg.Add in NewFoo (or something like that) and wg.Done in Close. And to prevent multiple calls to Done ruining the wait group, you should wrap Close into sync.Once. You may also want to prevent new Bar() from being called.
WaitGroup is one way, however, the Go team introduced the errgroup for your use case exactly. The most inconvenient part of leaf bebop's answer, is the disregard for error handling. Error handling is the reason errgroup exists. And idiomatic go code should never swallow errors.
However, keeping the signatures of your Foo struct, (except a cosmetic workerNumber)—and no error handling—my proposal looks like this:
package main
import (
"fmt"
"math/rand"
"time"
"golang.org/x/sync/errgroup"
)
type Foo struct {
errg errgroup.Group
}
func NewFoo() *Foo {
foo := &Foo{
errg: errgroup.Group{},
}
return foo
}
func (a *Foo) Bar(workerNumber int) {
a.errg.Go(func() error {
select {
// simulates the long running clals
case <-time.After(time.Second * time.Duration(rand.Intn(10))):
fmt.Println(fmt.Sprintf("worker %d completed its work", workerNumber))
return nil
}
})
}
func (a *Foo) Close() {
a.errg.Wait()
}
func main() {
foo := NewFoo()
for i := 0; i < 10; i++ {
foo.Bar(i)
}
<-time.After(time.Second * 5)
fmt.Println("Waiting for workers to complete...")
foo.Close()
fmt.Println("Done.")
}
The benefit here, is that if you introduce error handling in your code (you should), you only need to slightly modify this code: In short, errg.Wait() would return the first redis error, and Close() could propagate this up through the stack (to main, in this case).
Utilizing the context.Context package as well, you would also be able to immediately cancel any running redis call, if one fails. There are examples of this in the errgroup documentation.
I think waiting indefinitely for all the go routines to finish is not the right way.
If one of the go routines get blocked or say it hangs due to some reason and never terminates successfully, what should happen kill the process or wait for go routines to finish ?
Instead you should wait with some timeout and kill the app irrespective of whether all the routines have finished or not.
Edit: Original ans
Thanks #leaf bebop for pointing it out. I misunderstood the question.
Context package can be used to signal all the go routines to handle kill signal.
appCtx, cancel := context.WithCancel(context.Background())
Here appCtx will have to be passed to all the go routines.
On exit signal call cancel().
functions running as go routines can handle how to handle cancel context.
Using context cancellation in Go
A pattern i use a lot is: https://play.golang.org/p/ibMz36TS62z
package main
import (
"fmt"
"sync"
"time"
)
type response struct {
message string
}
func task(i int, done chan response) {
time.Sleep(1 * time.Second)
done <- response{fmt.Sprintf("%d done", i)}
}
func main() {
responses := GetResponses(10)
fmt.Println("all done", len(responses))
}
func GetResponses(n int) []response {
donequeue := make(chan response)
wg := sync.WaitGroup{}
for i := 0; i < n; i++ {
wg.Add(1)
go func(value int) {
defer wg.Done()
task(value, donequeue)
}(i)
}
go func() {
wg.Wait()
close(donequeue)
}()
responses := []response{}
for result := range donequeue {
responses = append(responses, result)
}
return responses
}
this makes it easy to throttle as well: https://play.golang.org/p/a4MKwJKj634
package main
import (
"fmt"
"sync"
"time"
)
type response struct {
message string
}
func task(i int, done chan response) {
time.Sleep(1 * time.Second)
done <- response{fmt.Sprintf("%d done", i)}
}
func main() {
responses := GetResponses(10, 2)
fmt.Println("all done", len(responses))
}
func GetResponses(n, concurrent int) []response {
throttle := make(chan int, concurrent)
for i := 0; i < concurrent; i++ {
throttle <- i
}
donequeue := make(chan response)
wg := sync.WaitGroup{}
for i := 0; i < n; i++ {
wg.Add(1)
<-throttle
go func(value int) {
defer wg.Done()
throttle <- 1
task(value, donequeue)
}(i)
}
go func() {
wg.Wait()
close(donequeue)
}()
responses := []response{}
for result := range donequeue {
responses = append(responses, result)
}
return responses
}

which goroutine is executed when use sleep in go?

i am new in golang recently. i have a question about goroutine when use time.sleep function. here the code.
package main
import (
"fmt"
"time"
)
func go1(msg_chan chan string) {
for {
msg_chan <- "go1"
}
}
func go2(msg_chan chan string) {
for {
msg_chan <- "go2"
}
}
func count(msg_chan chan string) {
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
go go1(c)
go go2(c)
go count(c)
var input string
fmt.Scanln(&input)
}
and output is
go1
go2
go1
go2
go1
go2
i think when count function is execute sleep function, go1 and go2 will execute in random sequence. so the out put maybe like
go1
go1
go2
go2
go2
go1
when i delete the sleep code in count function. the result as i supposed , it's random.
i am stucked in this issue.
thanks.
First thing to notice is that there are three go routines and all of them are independent of each other. The only thing that combines the two go routines with count routine is the channel on which both go routines are sending the values.
time.Sleep is not making the go routines synchronous. On using time.Sleep you are actually letting the count go routine to wait for that long which let other go routine to send the value on the channel which is available for the count go routine to be able to receive.
One more thing that you can do to check it is increase the number of CPU's which will give you the random result.
func GOMAXPROCS(n int) int
GOMAXPROCS sets the maximum number of CPUs that can be executing
simultaneously and returns the previous setting. If n < 1, it does not
change the current setting. The number of logical CPUs on the local
machine can be queried with NumCPU. This call will go away when the
scheduler improves.
The number of CPUs available simultaneously to executing goroutines is
controlled by the GOMAXPROCS shell environment variable, whose default
value is the number of CPU cores available. Programs with the
potential for parallel execution should therefore achieve it by
default on a multiple-CPU machine. To change the number of parallel
CPUs to use, set the environment variable or use the similarly-named
function of the runtime package to configure the run-time support to
utilize a different number of threads. Setting it to 1 eliminates the
possibility of true parallelism, forcing independent goroutines to
take turns executing.
Considering the part where output of the go routine is random, it is always random. But channels most probably work in queue which is FIFO(first in first out) as it depends on which value is available on the channel to b received. So whichever be the value available on the channel to be sent is letting the count go routine to wait and print that value.
Take for an example even if I am using time.Sleep the output is random:
package main
import (
"fmt"
"time"
)
func go1(msg_chan chan string) {
for i := 0; i < 10; i++ {
msg_chan <- fmt.Sprintf("%s%d", "go1:", i)
}
}
func go2(msg_chan chan string) {
for i := 0; i < 10; i++ {
msg_chan <- fmt.Sprintf("%s%d", "go2:", i)
}
}
func count(msg_chan chan string) {
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
go go1(c)
go go2(c)
go count(c)
time.Sleep(time.Second * 20)
fmt.Println("finished")
}
This sometimes leads to race condition which is why we use synchronization either using channels or wait.groups.
package main
import (
"fmt"
"sync"
"time"
)
var wg sync.WaitGroup
func go1(msg_chan chan string) {
defer wg.Done()
for {
msg_chan <- "go1"
}
}
func go2(msg_chan chan string) {
defer wg.Done()
for {
msg_chan <- "go2"
}
}
func count(msg_chan chan string) {
defer wg.Done()
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
wg.Add(1)
go go1(c)
wg.Add(1)
go go2(c)
wg.Add(1)
go count(c)
wg.Wait()
fmt.Println("finished")
}
Now coming to the part where you are using never ending for loop to send the values on a channel. So if you remove the time.Sleep your process will hang since the loop will never stop to send the values on the channel.

Multiplexing Go Routine Output using fanIn function

I was trying to implement an example Go code for using returned channels from go routines without any "reading block" in the main function. Here, a fanIn function accepts channels from two other routines and return which got as input.
Here, the expected output is Random Outputs from two inner routines. But the actual output is always one "ann" followed by a "john", not at all random in any case.
Why am I not getting random output?
Go Playground: http://play.golang.org/p/46CiihtPwD
Actual output:
you say: ann,0
you say: john,0
you say: ann,1
you say: john,1
......
Code:
package main
import (
"fmt"
"time"
)
func main() {
final := fanIn(boring("ann"), boring("john"))
for i := 0; i < 100; i++ {
fmt.Println("you say:", <-final)
}
time.Sleep(4 * time.Second)
}
func boring(msg string) chan string {
c1 := make(chan string)
go func() {
for i := 0; ; i++ {
c1 <- fmt.Sprintf("%s,%d", msg, i)
time.Sleep(time.Second)
}
}()
return c1
}
func fanIn(input1, input2 <-chan string) chan string {
c := make(chan string)
go func() {
for {
c <- <-input1
}
}()
go func() {
for {
c <- <-input2
}
}()
return c
}
No particular reason, that's just how Go happens to schedule the relevant goroutines (basically, you're getting "lucky" that there's a pattern). You can't rely on it. If you really want an actual reliably random result, you'll have to manually mix in randomness somehow.
There's also the Multiplex function from https://github.com/eapache/channels/ (doc: https://godoc.org/github.com/eapache/channels#Multiplex) which does effectively the same thing as your fanIn function. I don't think it would behave any differently in terms of randomness though.

How to wait for all goroutines to finish without using time.Sleep?

This code selects all xml files in the same folder, as the invoked executable and asynchronously applies processing to each result in the callback method (in the example below, just the name of the file is printed out).
How do I avoid using the sleep method to keep the main method from exiting? I have problems wrapping my head around channels (I assume that's what it takes, to synchronize the results) so any help is appreciated!
package main
import (
"fmt"
"io/ioutil"
"path"
"path/filepath"
"os"
"runtime"
"time"
)
func eachFile(extension string, callback func(file string)) {
exeDir := filepath.Dir(os.Args[0])
files, _ := ioutil.ReadDir(exeDir)
for _, f := range files {
fileName := f.Name()
if extension == path.Ext(fileName) {
go callback(fileName)
}
}
}
func main() {
maxProcs := runtime.NumCPU()
runtime.GOMAXPROCS(maxProcs)
eachFile(".xml", func(fileName string) {
// Custom logic goes in here
fmt.Println(fileName)
})
// This is what i want to get rid of
time.Sleep(100 * time.Millisecond)
}
You can use sync.WaitGroup. Quoting the linked example:
package main
import (
"net/http"
"sync"
)
func main() {
var wg sync.WaitGroup
var urls = []string{
"http://www.golang.org/",
"http://www.google.com/",
"http://www.somestupidname.com/",
}
for _, url := range urls {
// Increment the WaitGroup counter.
wg.Add(1)
// Launch a goroutine to fetch the URL.
go func(url string) {
// Decrement the counter when the goroutine completes.
defer wg.Done()
// Fetch the URL.
http.Get(url)
}(url)
}
// Wait for all HTTP fetches to complete.
wg.Wait()
}
WaitGroups are definitely the canonical way to do this. Just for the sake of completeness, though, here's the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done," and have the main goroutine wait until each spawned routine has reported its completion.
func main() {
c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
for i := 0; i < 100; i++ {
go func() {
doSomething()
c <- struct{}{} // signal that the routine has completed
}()
}
// Since we spawned 100 routines, receive 100 messages.
for i := 0; i < 100; i++ {
<- c
}
}
sync.WaitGroup can help you here.
package main
import (
"fmt"
"sync"
"time"
)
func wait(seconds int, wg * sync.WaitGroup) {
defer wg.Done()
time.Sleep(time.Duration(seconds) * time.Second)
fmt.Println("Slept ", seconds, " seconds ..")
}
func main() {
var wg sync.WaitGroup
for i := 0; i <= 5; i++ {
wg.Add(1)
go wait(i, &wg)
}
wg.Wait()
}
Although sync.waitGroup (wg) is the canonical way forward, it does require you do at least some of your wg.Add calls before you wg.Wait for all to complete. This may not be feasible for simple things like a web crawler, where you don't know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.
I wrote a solution using channels, avoiding waitGroup in my solution the the Tour of Go - web crawler exercise. Each time one or more go-routines are started, you send the number to the children channel. Each time a go routine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are done.
My only remaining concern is the hard-coded size of the the results channel, but that is a (current) Go limitation.
// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.waitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls
// (done) and results (results). Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
results chan string
children chan int
done chan int
}
// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
// we buffer results to 1000, so we cannot crawl more pages than that.
return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}
// recursionController.Add: convenience function to add children to controller (similar to waitGroup)
func (rc recursionController) Add(children int) {
rc.children <- children
}
// recursionController.Done: convenience function to remove a child from controller (similar to waitGroup)
func (rc recursionController) Done() {
rc.done <- 1
}
// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
fmt.Println("Controller waiting...")
var children, done int
for {
select {
case childrenDelta := <-rc.children:
children += childrenDelta
// fmt.Printf("children found %v total %v\n", childrenDelta, children)
case <-rc.done:
done += 1
// fmt.Println("done found", done)
default:
if done > 0 && children == done {
fmt.Printf("Controller exiting, done = %v, children = %v\n", done, children)
close(rc.results)
return
}
}
}
}
Full source code for the solution
Here is a solution that employs WaitGroup.
First, define 2 utility methods:
package util
import (
"sync"
)
var allNodesWaitGroup sync.WaitGroup
func GoNode(f func()) {
allNodesWaitGroup.Add(1)
go func() {
defer allNodesWaitGroup.Done()
f()
}()
}
func WaitForAllNodes() {
allNodesWaitGroup.Wait()
}
Then, replace the invocation of callback:
go callback(fileName)
With a call to your utility function:
util.GoNode(func() { callback(fileName) })
Last step, add this line at the end of your main, instead of your sleep. This will make sure the main thread is waiting for all routines to finish before the program can stop.
func main() {
// ...
util.WaitForAllNodes()
}

Resources