Golang goroutine cannot use function return value(s) - go

I wish I could do something like the following code with golang:
package main
import (
"fmt"
"time"
)
func getA() (int) {
fmt.Println("getA: Calculating...")
time.Sleep(300 * time.Millisecond)
fmt.Println("getA: Done!")
return 100
}
func getB() (int) {
fmt.Println("getB: Calculating...")
time.Sleep(400 * time.Millisecond)
fmt.Println("getB: Done!")
return 200
}
func main() {
A := go getA()
B := go getB()
C := A + B // waits till getA() and getB() return
fmt.Println("Result:", C)
fmt.Println("All done!")
}
More specifically, I wish Go could handle concurrency behind the scene.
This might be a bit off topic, but I am curious what people think of having such implicit concurrency support. Is it worth to put some effort on it? and what are the potential difficulties and drawbacks?
Edit:
To be clear, the question is not about "what Go is doing right now?", and not even "how it is implemented?" though I appreciate #icza post on what exactly we should expect from Go as it is right now. The question is why it does not or is not capable of returning values, and what are the potential complications of doing that?
Back to my simple example:
A := go getA()
B := go getB()
C := A + B // waits till getA() and getB() return
I do not see any issues concerning the scope of variables, at least from the syntax point of view. The scope of A, B, and C is clearly defined by the block they are living inside (in my example the scope is the main() function). However, a perhaps more legitimate question would be if those variables (here A and B) are "ready" to read from? Of course they should not be ready and accessible till getA() and getB() are finished. In fact, this is all I am asking for: The compiler could implement all the bell and whistles behind the scene to make sure the execution will be blocked till A and B are ready to consume (instead of forcing the programmer to explicitly implement those waits and whistles using channels).
This could make the concurrent programming much simpler, especially for the cases where computational tasks are independent of each other. The channels still could be used for explicit communication and synchronization, if really needed.

But this is very easily and idiomatically doable. The language provides the means: Channel types.
Simply pass a channel to the functions, and have the functions send the result on the channel instead of returning them. Channels are safe for concurrent use.
Only one goroutine has access to the value at any given time. Data races cannot occur, by design.
For more, check out the question: If I am using channels properly should I need to use mutexes?
Example solution:
func getA(ch chan int) {
fmt.Println("getA: Calculating...")
time.Sleep(300 * time.Millisecond)
fmt.Println("getA: Done!")
ch <- 100
}
func getB(ch chan int) {
fmt.Println("getB: Calculating...")
time.Sleep(400 * time.Millisecond)
fmt.Println("getB: Done!")
ch <- 200
}
func main() {
cha := make(chan int)
chb := make(chan int)
go getA(cha)
go getB(chb)
C := <-cha + <-chb // waits till getA() and getB() return
fmt.Println("Result:", C)
fmt.Println("All done!")
}
Output (try it on the Go Playground):
getB: Calculating...
getA: Calculating...
getA: Done!
getB: Done!
Result: 300
All done!
Note:
The above example can be implemented with a single channel too:
func main() {
ch := make(chan int)
go getA(ch)
go getB(ch)
C := <-ch + <-ch // waits till getA() and getB() return
fmt.Println("Result:", C)
fmt.Println("All done!")
}
Output is the same. Try this variant on the Go Playground.
Edit:
The Go spec states that the return values of such functions are discarded. More on this: What happens to return value from goroutine.
What you propose bleeds from multiple wounds. Each variable in Go has a scope (in which they can be referred to). Accessing variables do not block. Execution of statements or operators may block (e.g. Receive operator or Send statement).
Your proposal:
go A := getA()
go B := getB()
C := A + B // waits till getA() and getB() return
What is the scope of A and B? 2 reasonable answer would be a) from the go statement or after the go statement. Either way we should be able to access it after the go statement. After the go statement they would be in scope, reading/writing their values should not block.
But then if C := A + B would not block (because it is just reading a variable), then either
a) at this line A and B should already be populated which means the go statement would need to wait for getA() to complete (but then it defeats the purpose of go statement)
b) or else we would need some external code to synchronize but then again we don't gain anything (just make it worse compared to the solution with channels).
Do not communicate by sharing memory; instead, share memory by communicating.
By using channels, it is crystal clear what (may) block and what not. It is crystal clear that when a receive from the channel completes, the goroutine is done. And it gives us the mean to execute the receive whenever we're want to (at the point when its value is needed and we're willing to wait for it), and also the mean to check if the value is ready without blocking (comma-ok idiom).

Related

Doesn't go routine and the channels work in order of call?

Doesn't go routine and the channels worked in the order they were called.
and go routine share values between the region variables?
main.go
var dataSendChannel = make(chan int)
func main() {
a(dataSendChannel)
time.Sleep(time.Second * 10)
}
func a(c chan<- int) {
for i := 0; i < 1000; i++ {
go b(dataSendChannel)
c <- i
}
}
func b(c <-chan int) {
val := <-c
fmt.Println(val)
}
output
> go run main.go
0
1
54
3
61
5
6
7
8
9
Channels are ordered. Goroutines are not. Goroutines may run, or stall, more or less at random, whenever they are logically allowed to run. They must stop and wait whenever you force them to do so, e.g., by attempting to write on a full channel, or using a mutex.Lock() call on an already-locked mutex, or any of those sorts of things.
Your dataSendChannel is unbuffered, so an attempt to write to it will pause until some goroutine is actively attempting to read from it. Function a spins off one goroutine that will attempt one read (go b(...)), then writes and therefore waits for at least one reader to be reading. Function b immediately begins reading, waiting for data. This unblocks function a, which can now write some integer value. Function a can now spin off another instance of b, which begins reading; this may happen before, during, or after the b that got a value begins calling fmt.Println. This second instance of b must now wait for someone—which in this case is always function a, running the loop—to send another value, but a does that as quickly as it can. The second instance of b can now begin calling fmt.Println, but it might, mostly-randomly, not get a chance to do that yet. The first instance of b might already be in fmt.Println, or maybe it isn't yet, and the second one might run first—or maybe both wait around for a while and a third instance of b spins up, reads some value from the channel, and so on.
There's no guarantee which instance of b actually gets into fmt.Println when, so the values you see printed will come out in some semi-random order. If you want the various b instances to sequence themselves, they will need to do that somehow.

Computing mod inverse

I want to compute the inverse element of a prime in modular arithmetic.
In order to speed things up I start a few goroutines which try to find the element in a certain range. When the first one finds the element, it sends it to the main goroutine and at this point I want to terminate the program. So I call close in the main goroutine, but I don't know if the goroutines will finish their execution (I guess not). So a few questions arise:
1) Is this a bad style, should I have something like a WaitGroup?
2) Is there a more idiomatic way to do this computation?
package main
import "fmt"
const (
Procs = 8
P = 1000099
Base = 1<<31 - 1
)
func compute(start, end uint64, finished chan struct{}, output chan uint64) {
for i := start; i < end; i++ {
select {
case <-finished:
return
default:
break
}
if i*P%Base == 1 {
output <- i
}
}
}
func main() {
finished := make(chan struct{})
output := make(chan uint64)
for i := uint64(0); i < Procs; i++ {
start := i * (Base / Procs)
end := (i + 1) * (Base / Procs)
go compute(start, end, finished, output)
}
fmt.Println(<-output)
close(finished)
}
Is there a more idiomatic way to do this computation?
You don't actually need a loop to compute this.
If you use the GCD function (part of the standard library), you get returned numbers x and y such that:
x*P+y*Base=1
this means that x is the answer you want (because x*P = 1 modulo Base):
package main
import (
"fmt"
"math/big"
)
const (
P = 1000099
Base = 1<<31 - 1
)
func main() {
bigP := big.NewInt(P)
bigBase := big.NewInt(Base)
// Compute inverse of bigP modulo bigBase
bigGcd := big.NewInt(0)
bigX := big.NewInt(0)
bigGcd.GCD(bigX,nil,bigP,bigBase)
// x*bigP+y*bigBase=1
// => x*bigP = 1 modulo bigBase
fmt.Println(bigX)
}
Is this a bad style, should I have something like a WaitGroup?
A wait group solves a different problem.
In general, to be a responsible go citizen here and ensure your code runs and tidies up behind itself, you may need to do a combination of:
Signal to the spawned goroutines to stop their calculations when the result of the computation has been found elsewhere.
Ensure a synchronous process waits for the goroutines to stop before returning. This is not mandatory if they properly respond to the signal in #1, but if you don't wait, there will be no guarantee they have terminated before the parent goroutine continues.
In your example program, which performs this task and then quits, there is strictly no need to do either. As this comment indicates, your program's main method terminates upon a satisfactory answer being found, at which point the program will end, any goroutines will be summarily terminated, and the operating system will tidy up any consumed resources. Waiting for goroutines to stop is unnecessary.
However, if you wrapped this code up into a library or it became part of a long running "inverse prime calculation" service, it would be desirable to tidy up the goroutines you spawned to avoid wasting cycles unnecessarily. Additionally, in general, you may have other scenarios in which goroutines store state, hold handles to external resources, or hold handles to internal objects which you risk leaking if not properly tidied away – it is desirable to properly close these.
Communicating the requirement to stop working
There are several approaches to communicate this. I don't claim this is an exhaustive list! (Please do suggest other general-purpose methods in the comments or by proposing edits to the post.)
Using a special channel
Signal the child goroutines by closing a special "shutdown" channel reserved for the purpose. This exploits the channel axiom:
A receive from a closed channel returns the zero value immediately
On receiving from the shutdown channel, the goroutine should immediately arrange to tidy any local state and return from the function. Your earlier question had example code which implemented this; a version of the pattern is:
func myGoRoutine(shutdownChan <-chan struct{}) {
select {
case <-shutdownChan:
// tidy up behaviour goes here
return
// You may choose to listen on other channels here to implement
// the primary behaviour of the goroutine.
}
}
func main() {
shutdownChan := make(chan struct{})
go myGoRoutine(shutdownChan)
// some time later
close(shutdownChan)
}
In this instance, the shutdown logic is wasted because the main() method will immediately return after the call to close. This will race with the shutdown of the goroutine, but we should assume it will not properly execute its tidy-up behaviour. Point 2 addresses ways to fix this.
Using a context
The context package provides the option to create a context which can be cancelled. On cancellation, a channel exposed by the context's Done() method will be closed, which signals time to return from the goroutine.
This approach is approximately the same as the previous method, with the exception of neater encapsulation and the availability of a context to pass to downstream calls in your goroutine to cancel nested calls where desired. Example:
func myGoRoutine(ctx context.Context) {
select {
case <-ctx.Done():
// tidy up behaviour goes here
return
// Put real behaviour for the goroutine here.
}
}
func main() {
// Get a context (or use an existing one if you are provided with one
// outside a `main` method:
ctx := context.Background()
// Create a derived context with a cancellation method
ctx, cancel := context.WithCancel(ctx)
go myGoRoutine(ctx)
// Later, when ready to quit
cancel()
}
This has the same bug as the other case in that the main method will not wait for the child goroutines to quit before returning.
Waiting (or "join"ing) for child goroutines to stop
The code which closes the shutdown channel or closes the context in the above examples will not wait for child goroutines to stop working before continuing. This may be acceptable in some instances, while in others you may require the guarantee that goroutines have stopped before continuing.
sync.WaitGroup can be used to implement this requirement. The documentation is comprehensive. A wait group is a counter which should be incremented using its Add method on starting a goroutine and decremented using its Done method when a goroutine completes. Code can wait for the counter to return to zero by calling its Wait method, which blocks until the condition is true. All calls to Add must occur before a call to Wait.
Example code:
func main() {
var wg sync.WaitGroup
// Increment the WaitGroup with the number of goroutines we're
// spawning.
wg.Add(1)
// It is common to wrap a goroutine in a function which performs
// the decrement on the WaitGroup once the called function returns
// to avoid passing references of this control logic to the
// downstream consumer.
go func() {
// TODO: implement a method to communicate shutdown.
callMyFunction()
wg.Done()
}()
// Indicate shutdown, e.g. by closing a channel or cancelling a
// context.
// Wait for goroutines to stop
wg.Wait()
}
Is there a more idiomatic way to do this computation?
This algorithm is certainly parallelizable through use of goroutines in the manner you have defined. As the work is CPU-bound, the limitation of goroutines to the number of available CPUs makes sense (in the absence of other work on the machine) to benefit from the available compute resource.
See peterSO's answer for a bug fix.

Golang concurrency and blocking with a single channel, explanation needed

I'm playing around with the code presented on https://tour.golang.org/concurrency/5. My idea was that I could simplify the code by getting rid of the quit channel and still keep the correct program behavior - just for learning purposes.
Here is the code I've got (simplified it even more for better readability):
package main
import "fmt"
import "time"
func sendNumbers(c chan int) {
for i := 0; i < 2; i++ {
c <- i
}
}
func main() {
c := make(chan int)
go func() {
for i := 0; i < 2; i++ {
fmt.Println(<-c)
}
}()
time.Sleep(0 * time.Second)
sendNumbers(c)
}
In this code, the go routine I spawn should be able to receive 2 numbers before it returns. The sendNumbers() function I call next sends exactly 2 numbers to the channel c.
So, the expected output of the program is 2 lines: 0 and 1. However, what I'm getting when I run the code on the page, is just a single line containing 0.
Note the
time.Sleep(0 * time.Second)
though that I deliberately added before calling sendNumbers(c). If I change it to
time.Sleep(1 * time.Second)
The output becomes as expected:
0
1
So, I'm confused with what happens. Can someone explain what is going on? Shouldn't both sends and receives block regardless of how much time is passed before I call sendNumbers()?
In Go, the program exits as soon as the main function returns regardless of whether other goroutines are still running or not. If you want to make sure the main function does not exit prematurely, you need some synchronization mechanism. https://nathanleclaire.com/blog/2014/02/15/how-to-wait-for-all-goroutines-to-finish-executing-before-continuing/
You don’t absolutely have to use synchronization primitives, you could also do it with channels only, arguably a more idiomatic way to do it.

How to use channels to safely synchronise data in Go

Below is an example of how to use mutex lock in order to safely access data. How would I go about doing the same with the use of CSP (communication sequential processes) instead of using mutex lock’s and unlock’s?
type Stack struct {
top *Element
size int
sync.Mutex
}
func (ss *Stack) Len() int {
ss.Lock()
size := ss.size
ss.Unlock()
return size
}
func (ss *Stack) Push(value interface{}) {
ss.Lock()
ss.top = &Element{value, ss.top}
ss.size++
ss.Unlock()
}
func (ss *SafeStack) Pop() (value interface{}) {
ss.Lock()
size := ss.size
ss.Unlock()
if size > 0 {
ss.Lock()
value, ss.top = ss.top.value, ss.top.next
ss.size--
ss.Unlock()
return
}
return nil
}
If you actually were to look at how Go implements channels, you'd essentially see a mutex around an array with some additional thread handling to block execution until the value is passed through. A channel's job is to move data from one spot in memory to another with ease. Therefore where you have locks and unlocks, you'd have things like this example:
func example() {
resChan := make(int chan)
go func(){
resChan <- 1
}()
go func(){
res := <-resChan
}
}
So in the example, the first goroutine is blocked after sending the value until the second goroutine reads from the channel.
To do this in Go with mutexes, one would use sync.WaitGroup which will add one to the group on setting the value, then release it from the group and the second goroutine will lock and then unlock the value.
The oddities in your example are 1 no goroutines, so it's all happening in a single main goroutine and the locks are being used more traditionally (as in c thread like) so channels won't really accomplish anything. The example you have would be considered an anti-pattern, like the golang proverb says "Don't communicate by sharing memory, share memory by communicating."

Catching return values from goroutines

The below code gives compilation error saying 'unexpected go':
x := go doSomething(arg)
func doSomething(arg int) int{
...
return my_int_value
}
I know, I can fetch the return value if I call the function normally i.e. without using goroutine or I can use channels etc.
My question is why is it not possible to fetch a return value like this from a goroutine.
Why is it not possible to fetch a return value from a goroutine assigning it to a variable?
Run goroutine (asynchronously) and fetch return value from function are essentially contradictory actions. When you say go you mean "do it asynchronously" or even simpler: "Go on! Don't wait for the function execution be finished". But when you assign function return value to a variable you are expecting to have this value within the variable. So when you do that x := go doSomething(arg) you are saying: "Go on, don't wait for the function! Wait-wait-wait! I need a returned value be accessible in x var right in the next line below!"
Channels
The most natural way to fetch a value from a goroutine is channels. Channels are the pipes that connect concurrent goroutines. You can send values into channels from one goroutine and receive those values into another goroutine or in a synchronous function. You could easily obtain a value from a goroutine not breaking concurrency using select:
func main() {
c1 := make(chan string)
c2 := make(chan string)
go func() {
time.Sleep(time.Second * 1)
c1 <- "one"
}()
go func() {
time.Sleep(time.Second * 2)
c2 <- "two"
}()
for i := 0; i < 2; i++ {
// Await both of these values
// simultaneously, printing each one as it arrives.
select {
case msg1 := <-c1:
fmt.Println("received", msg1)
case msg2 := <-c2:
fmt.Println("received", msg2)
}
}
}
The example is taken from Go By Example
CSP & message-passing
Go is largerly based on CSP theory. The naive description from above could be precisely outlined in terms of CSP (although I believe it is out of scope of the question). I strongly recommend to familiarize yourself with CSP theory at least because it is RAD. These short quotations give a direction of thinking:
As its name suggests, CSP allows the description of systems in terms of component processes that operate independently, and interact with each other solely through message-passing communication.
In computer science, message passing sends a message to a process and relies on the process and the supporting infrastructure to select and invoke the actual code to run. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name.
The strict answer is that you can do that. It's just probably not a good idea. Here's code that would do that:
var x int
go func() {
x = doSomething()
}()
This will spawn off a new goroutine which will calculate doSomething() and then assign the result to x. The problem is: how are you going to use x from the original goroutine? You probably want to make sure the spawned goroutine is done with it so that you don't have a race condition. But if you want to do that, you'll need a way to communicate with the goroutine, and if you've got a way to do that, why not just use it to send the value back?
The idea of the go keyword is that you run the doSomething function asynchronously, and continue the current goroutine without waiting for the result, kind of like executing a command in a Bash shell with an '&' after it. If you want to do
x := doSomething(arg)
// Now do something with x
then you need the current goroutine to block until doSomething finishes. So why not just call doSomething in the current goroutine? There are other options (like, doSomething could post a result to a channel, which the current goroutine receives values from) but simply calling doSomething and assigning the result to a variable is obviously simpler.
It's a design choice by Go creators. There's a whole lot of abstractions/APIs to represent the value of async I/O operations - promise, future, async/await, callback, observable, etc. These abstractions/APIs are inherently tied to the unit of scheduling - coroutines - and these abstractions/APIs dictate how coroutines (or more precisely the return value of async I/O represented by them) can be composed.
Go chose message passing (aka channels) as the abstraction/API to represent the return value of async I/O operations. And of course, goroutines and channels give you a composable tool to implement async I/O operations.
Why not use a channel to write into?
chanRes := make(chan int, 1)
go doSomething(arg, chanRes)
//blocks here or you can use some other sync mechanism (do something else) and wait
x := <- chanRes
func doSomething(arg int, out chan<- int){
...
out <- my_int_value
}

Resources