Catching return values from goroutines - go

The below code gives compilation error saying 'unexpected go':
x := go doSomething(arg)
func doSomething(arg int) int{
...
return my_int_value
}
I know, I can fetch the return value if I call the function normally i.e. without using goroutine or I can use channels etc.
My question is why is it not possible to fetch a return value like this from a goroutine.

Why is it not possible to fetch a return value from a goroutine assigning it to a variable?
Run goroutine (asynchronously) and fetch return value from function are essentially contradictory actions. When you say go you mean "do it asynchronously" or even simpler: "Go on! Don't wait for the function execution be finished". But when you assign function return value to a variable you are expecting to have this value within the variable. So when you do that x := go doSomething(arg) you are saying: "Go on, don't wait for the function! Wait-wait-wait! I need a returned value be accessible in x var right in the next line below!"
Channels
The most natural way to fetch a value from a goroutine is channels. Channels are the pipes that connect concurrent goroutines. You can send values into channels from one goroutine and receive those values into another goroutine or in a synchronous function. You could easily obtain a value from a goroutine not breaking concurrency using select:
func main() {
c1 := make(chan string)
c2 := make(chan string)
go func() {
time.Sleep(time.Second * 1)
c1 <- "one"
}()
go func() {
time.Sleep(time.Second * 2)
c2 <- "two"
}()
for i := 0; i < 2; i++ {
// Await both of these values
// simultaneously, printing each one as it arrives.
select {
case msg1 := <-c1:
fmt.Println("received", msg1)
case msg2 := <-c2:
fmt.Println("received", msg2)
}
}
}
The example is taken from Go By Example
CSP & message-passing
Go is largerly based on CSP theory. The naive description from above could be precisely outlined in terms of CSP (although I believe it is out of scope of the question). I strongly recommend to familiarize yourself with CSP theory at least because it is RAD. These short quotations give a direction of thinking:
As its name suggests, CSP allows the description of systems in terms of component processes that operate independently, and interact with each other solely through message-passing communication.
In computer science, message passing sends a message to a process and relies on the process and the supporting infrastructure to select and invoke the actual code to run. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name.

The strict answer is that you can do that. It's just probably not a good idea. Here's code that would do that:
var x int
go func() {
x = doSomething()
}()
This will spawn off a new goroutine which will calculate doSomething() and then assign the result to x. The problem is: how are you going to use x from the original goroutine? You probably want to make sure the spawned goroutine is done with it so that you don't have a race condition. But if you want to do that, you'll need a way to communicate with the goroutine, and if you've got a way to do that, why not just use it to send the value back?

The idea of the go keyword is that you run the doSomething function asynchronously, and continue the current goroutine without waiting for the result, kind of like executing a command in a Bash shell with an '&' after it. If you want to do
x := doSomething(arg)
// Now do something with x
then you need the current goroutine to block until doSomething finishes. So why not just call doSomething in the current goroutine? There are other options (like, doSomething could post a result to a channel, which the current goroutine receives values from) but simply calling doSomething and assigning the result to a variable is obviously simpler.

It's a design choice by Go creators. There's a whole lot of abstractions/APIs to represent the value of async I/O operations - promise, future, async/await, callback, observable, etc. These abstractions/APIs are inherently tied to the unit of scheduling - coroutines - and these abstractions/APIs dictate how coroutines (or more precisely the return value of async I/O represented by them) can be composed.
Go chose message passing (aka channels) as the abstraction/API to represent the return value of async I/O operations. And of course, goroutines and channels give you a composable tool to implement async I/O operations.

Why not use a channel to write into?
chanRes := make(chan int, 1)
go doSomething(arg, chanRes)
//blocks here or you can use some other sync mechanism (do something else) and wait
x := <- chanRes
func doSomething(arg int, out chan<- int){
...
out <- my_int_value
}

Related

Calling multiple functions with different signatures concurrently

I'd like some feedback on the implementation details of what I'm trying to build. What I want to achieve is have multiple functions with different signatures that can be called concurrently.
Calling the functions in coroutines sequentially works fine, but I'm wondering if there's a way to do this in a more idiomatic way, e.g. iterate over a slice of functions.
Since each function has different arguments and return values though, I have trouble figuring out what the best approach would be. An example that is a bit similar to my goal can be seen here: Golang - How do you create a slice of functions with different signatures?, but there the code just calls the functions and doesn't account for any return values.
Is what I have in mind even possible?
You can use code from linked question and just wrap the v.Call(params) into an anonymous function executing in its own goroutine like this:
...
// WaitGroup to wait on goroutines to finish their execution
var wg sync.WaitGroup
for a, v := range f {
v := reflect.TypeOf(v)
//calling the function from reflect
val := reflect.ValueOf(f[a])
params := make([]reflect.Value, v.NumIn())
if v.NumIn() == 1 {
params[0] = reflect.ValueOf(1564)
} else if v.NumIn() == 2 {
params[0] = reflect.ValueOf("Test FROM reflect")
params[1] = reflect.ValueOf(float32(123456))
}
// Run them in parallel
wg.Add(1)
go func() {
defer wg.Done()
val.Call(params)
}()
}
wg.Wait()
See it on Go Playground
As for return values Value.Call() returns []Value which is slice of return values - so you are covered here too. Your question doesn't specify what you intend to do with results but given they will be generated in parallel you'll probably need to send them through a channel(s) - you can do that in anonymous function (after processing return slice) too.
go func() { MyPackage.MyFunc(with, whatsoever, signature); }() - roughtly, that's what you need. You span as many goroutines (using the go keyword) as there are concurrent functions.
There is no notion of "returned value" from goroutine. For that you have to use channels. They are primary communication mechanism. So, you span a new goroutine with some function f of arbitrary signature and when it's done and you got some result, you send it to some channel shared between goroutines for communication.
Channels are thread-safe and were carefully designed to handle such a communication gracefully. Go, as programming language, provides few keywords that deal with reading/writing to/from channels. So there are pretty fundamental to (concurrent) programming in Go.
However, of course, you can handle it differently. Sharing some mutable memory protected by some kind of locking, or relying upon lockless compareAndSet fashion. Arguably, that's less idiomatic way and generally have to be avoided. Always prefer channels.

Computing mod inverse

I want to compute the inverse element of a prime in modular arithmetic.
In order to speed things up I start a few goroutines which try to find the element in a certain range. When the first one finds the element, it sends it to the main goroutine and at this point I want to terminate the program. So I call close in the main goroutine, but I don't know if the goroutines will finish their execution (I guess not). So a few questions arise:
1) Is this a bad style, should I have something like a WaitGroup?
2) Is there a more idiomatic way to do this computation?
package main
import "fmt"
const (
Procs = 8
P = 1000099
Base = 1<<31 - 1
)
func compute(start, end uint64, finished chan struct{}, output chan uint64) {
for i := start; i < end; i++ {
select {
case <-finished:
return
default:
break
}
if i*P%Base == 1 {
output <- i
}
}
}
func main() {
finished := make(chan struct{})
output := make(chan uint64)
for i := uint64(0); i < Procs; i++ {
start := i * (Base / Procs)
end := (i + 1) * (Base / Procs)
go compute(start, end, finished, output)
}
fmt.Println(<-output)
close(finished)
}
Is there a more idiomatic way to do this computation?
You don't actually need a loop to compute this.
If you use the GCD function (part of the standard library), you get returned numbers x and y such that:
x*P+y*Base=1
this means that x is the answer you want (because x*P = 1 modulo Base):
package main
import (
"fmt"
"math/big"
)
const (
P = 1000099
Base = 1<<31 - 1
)
func main() {
bigP := big.NewInt(P)
bigBase := big.NewInt(Base)
// Compute inverse of bigP modulo bigBase
bigGcd := big.NewInt(0)
bigX := big.NewInt(0)
bigGcd.GCD(bigX,nil,bigP,bigBase)
// x*bigP+y*bigBase=1
// => x*bigP = 1 modulo bigBase
fmt.Println(bigX)
}
Is this a bad style, should I have something like a WaitGroup?
A wait group solves a different problem.
In general, to be a responsible go citizen here and ensure your code runs and tidies up behind itself, you may need to do a combination of:
Signal to the spawned goroutines to stop their calculations when the result of the computation has been found elsewhere.
Ensure a synchronous process waits for the goroutines to stop before returning. This is not mandatory if they properly respond to the signal in #1, but if you don't wait, there will be no guarantee they have terminated before the parent goroutine continues.
In your example program, which performs this task and then quits, there is strictly no need to do either. As this comment indicates, your program's main method terminates upon a satisfactory answer being found, at which point the program will end, any goroutines will be summarily terminated, and the operating system will tidy up any consumed resources. Waiting for goroutines to stop is unnecessary.
However, if you wrapped this code up into a library or it became part of a long running "inverse prime calculation" service, it would be desirable to tidy up the goroutines you spawned to avoid wasting cycles unnecessarily. Additionally, in general, you may have other scenarios in which goroutines store state, hold handles to external resources, or hold handles to internal objects which you risk leaking if not properly tidied away – it is desirable to properly close these.
Communicating the requirement to stop working
There are several approaches to communicate this. I don't claim this is an exhaustive list! (Please do suggest other general-purpose methods in the comments or by proposing edits to the post.)
Using a special channel
Signal the child goroutines by closing a special "shutdown" channel reserved for the purpose. This exploits the channel axiom:
A receive from a closed channel returns the zero value immediately
On receiving from the shutdown channel, the goroutine should immediately arrange to tidy any local state and return from the function. Your earlier question had example code which implemented this; a version of the pattern is:
func myGoRoutine(shutdownChan <-chan struct{}) {
select {
case <-shutdownChan:
// tidy up behaviour goes here
return
// You may choose to listen on other channels here to implement
// the primary behaviour of the goroutine.
}
}
func main() {
shutdownChan := make(chan struct{})
go myGoRoutine(shutdownChan)
// some time later
close(shutdownChan)
}
In this instance, the shutdown logic is wasted because the main() method will immediately return after the call to close. This will race with the shutdown of the goroutine, but we should assume it will not properly execute its tidy-up behaviour. Point 2 addresses ways to fix this.
Using a context
The context package provides the option to create a context which can be cancelled. On cancellation, a channel exposed by the context's Done() method will be closed, which signals time to return from the goroutine.
This approach is approximately the same as the previous method, with the exception of neater encapsulation and the availability of a context to pass to downstream calls in your goroutine to cancel nested calls where desired. Example:
func myGoRoutine(ctx context.Context) {
select {
case <-ctx.Done():
// tidy up behaviour goes here
return
// Put real behaviour for the goroutine here.
}
}
func main() {
// Get a context (or use an existing one if you are provided with one
// outside a `main` method:
ctx := context.Background()
// Create a derived context with a cancellation method
ctx, cancel := context.WithCancel(ctx)
go myGoRoutine(ctx)
// Later, when ready to quit
cancel()
}
This has the same bug as the other case in that the main method will not wait for the child goroutines to quit before returning.
Waiting (or "join"ing) for child goroutines to stop
The code which closes the shutdown channel or closes the context in the above examples will not wait for child goroutines to stop working before continuing. This may be acceptable in some instances, while in others you may require the guarantee that goroutines have stopped before continuing.
sync.WaitGroup can be used to implement this requirement. The documentation is comprehensive. A wait group is a counter which should be incremented using its Add method on starting a goroutine and decremented using its Done method when a goroutine completes. Code can wait for the counter to return to zero by calling its Wait method, which blocks until the condition is true. All calls to Add must occur before a call to Wait.
Example code:
func main() {
var wg sync.WaitGroup
// Increment the WaitGroup with the number of goroutines we're
// spawning.
wg.Add(1)
// It is common to wrap a goroutine in a function which performs
// the decrement on the WaitGroup once the called function returns
// to avoid passing references of this control logic to the
// downstream consumer.
go func() {
// TODO: implement a method to communicate shutdown.
callMyFunction()
wg.Done()
}()
// Indicate shutdown, e.g. by closing a channel or cancelling a
// context.
// Wait for goroutines to stop
wg.Wait()
}
Is there a more idiomatic way to do this computation?
This algorithm is certainly parallelizable through use of goroutines in the manner you have defined. As the work is CPU-bound, the limitation of goroutines to the number of available CPUs makes sense (in the absence of other work on the machine) to benefit from the available compute resource.
See peterSO's answer for a bug fix.

Is there a resource leak here?

func First(query string, replicas ...Search) Result {
c := make(chan Result)
searchReplica := func(i int) {
c <- replicas[i](query)
}
for i := range replicas {
go searchReplica(i)
}
return <-c
}
This function is from the slides of Rob Pike on go concurrency patterns in 2012. I think there is a resource leak in this function. As the function return after the first send & receive pair happens on channel c, the other go routines try to send on channel c. So there is a resource leak here. Anyone knows golang well can confirm this? And how can I detect this leak using what kind of golang tooling?
Yes, you are right (for reference, here's the link to the slide). In the above code only one launched goroutine will terminate, the rest will hang on attempting to send on channel c.
Detailing:
c is an unbuffered channel
there is only a single receive operation, in the return statement
A new goroutine is launched for each element of replicas
each launched goroutine sends a value on channel c
since there is only 1 receive from it, one goroutine will be able to send a value on it, the rest will block forever
Note that depending on the number of elements of replicas (which is len(replicas)):
if it's 0: First() would block forever (no one sends anything on c)
if it's 1: would work as expected
if it's > 1: then it leaks resources
The following modified version will not leak goroutines, by using a non-blocking send (with the help of select with default branch):
searchReplica := func(i int) {
select {
case c <- replicas[i](query):
default:
}
}
The first goroutine ready with the result will send it on channel c which will be received by the goroutine running First(), in the return statement. All other goroutines when they have the result will attempt to send on the channel, and "seeing" that it's not ready (send would block because nobody is ready to receive from it), the default branch will be chosen, and thus the goroutine will end normally.
Another way to fix it would be to use a buffered channel:
c := make(chan Result, len(replicas))
And this way the send operations would not block. And of course only one (the first sent) value will be received from the channel and returned.
Note that the solution with any of the above fixes would still block if len(replicas) is 0. To avoid that, First() should check this explicitly, e.g.:
func First(query string, replicas ...Search) Result {
if len(replicas) == 0 {
return Result{}
}
// ...rest of the code...
}
Some tools / resources to detect leaks:
https://github.com/fortytw2/leaktest
https://github.com/zimmski/go-leak
https://medium.com/golangspec/goroutine-leak-400063aef468
https://blog.minio.io/debugging-go-routine-leaks-a1220142d32c

Why is this code Undefined Behavior?

var x int
done := false
go func() { x = f(...); done = true }
while done == false { }
This is a Go code piece. My fiend told me this is UB code. Why?
As explained in "Why does this program terminate on my system but not on playground?"
The Go Memory Model does not guarantee that the value written to x in the goroutine will ever be observed by the main program.
A similarly erroneous program is given as an example in the section on go routine destruction.
The Go Memory Model also specifically calls out busy waiting without synchronization as an incorrect idiom in this section.
(in your case, there is no guarantee that the value written to done in the goroutine will ever be observed by the main program)
Here, You need to do some kind of synchronization in the goroutine in order to guarantee that done=true happens before one of the iterations of the for loop in main.
The "while" (non-existent in Go) should be replaced by, for instance, a channel you block on (waiting for a communication)
for {
<-c // 2
}
Based on a channel (c := make(chan bool)) created in main, and closed (close(c)) in the goroutine.
The sync package provides other means to wait for a gorountine to end before exiting main.
See for instance Golang Example Wait until all the background goroutine finish:
var w sync.WaitGroup
w.Add(1)
go func() {
// do something
w.Done()
}
w.Wait()

More idiomatic way of adding channel result to queue on completion

So, right now, I just pass a pointer to a Queue object (implementation doesn't really matter) and call queue.add(result) at the end of goroutines that should add things to the queue.
I need that same sort of functionality—and of course doing a loop checking completion with the comma ok syntax is unacceptable in terms of performance versus the simple queue add function call.
Is there a way to do this better, or not?
There are actually two parts to your question: how does one queue data in Go, and how does one use a channel without blocking.
For the first part, it sounds like what you need to do is instead of using the channel to add things to the queue, use the channel as a queue. For example:
var (
ch = make(chan int) // You can add an int parameter to this make call to create a buffered channel
// Do not buffer these channels!
gFinished = make(chan bool)
processFinished = make(chan bool)
)
func f() {
go g()
for {
// send values over ch here...
}
<-gFinished
close(ch)
}
func g() {
// create more expensive objects...
gFinished <- true
}
func processObjects() {
for val := range ch {
// Process each val here
}
processFinished <- true
}
func main() {
go processObjects()
f()
<-processFinished
}
As for how you can make this more asynchronous, you can (as cthom06 pointed out) pass a second integer to the make call in the second line which will make send operations asynchronous until the channel's buffer is full.
EDIT: However (as cthom06 also pointed out), because you have two goroutines writing to the channel, one of them has to be responsible for closing the channel. Also, my previous revision would exit before processObjects could complete. The way I chose to synchronize the goroutines is by creating a couple more channels that pass around dummy values to ensure that the cleanup gets finished properly. Those channels are specifically unbuffered so that the sends happen in lock-step.

Resources