Noob on channels in Go - go

I'm trying to wrap my head around concurrency patterns in Go and was confused by this example from #69
package main
import "fmt"
func fibonacci(c, quit chan int) {
x, y := 0, 1
for {
select {
case c <- x:
x, y = y, x+y
case <-quit:
fmt.Println("quit")
return
}
}
}
func main() {
c := make(chan int)
quit := make(chan int)
go func() {
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
fibonacci(c, quit)
}
In particular, I don't see how
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
is supposed to work, since all we did was make the channel, and now we "receive" from it 10 times? I tried out other code where I create a channel and then try to receive from it right away and I always get an error, but this seems to work and I can't quite see how. Thanks for any help!

fmt.Println(<-c) will block until there's something to read from the channel. Since we start the for loop in a separate goroutine, it means the first iteration of the loop will simply sit idly and wait until there's something to read.
Then the fibonacci function starts, and pushes data down the channel. This will make the loop wake up and start printing.
I hope it makes better sense now.

I’m giving you a shorter version of the code above, which I think should be easier to understand. (I explain the differences below.) Consider this:
// http://play.golang.org/p/5CrBSu4wxd
package main
import "fmt"
func fibonacci(c chan int) {
x, y := 0, 1
for {
c <- x
x, y = y, x+y
}
}
func main() {
c := make(chan int)
go fibonacci(c)
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
}
This is a more straightforward version because your main function is clearly just printing 10 values from the channel, and then exiting; and there is a background goroutine that is filling the channel as long as a new value is needed.
This alternate version drops the quit channel, because the background goroutine simply dies when main() finishes (no need to kill it explicitly in such a simple example).
Of course this version also kills the use of select{}, which is the topic of #69. But seeing how both versions accomplish the same thing, except killing of the background goroutine, can perhaps be a good aid in understanding what select is doing.
Note, in particular, that if fibonacci() had a time.Sleep() as its first statement, the for loop would hang for that much time, but would eventually work.
Hope this helps!
P.S.: Just realized this version is just a simpler version than #68, so I’m not sure how much it’ll help. Oops. :-)

Related

How this select works in goroutine?

i have been following the go tour examples, and i don't understand how this works
https://tour.golang.org/concurrency/5
package main
import "fmt"
func fibonacci(c, quit chan int) {
x, y := 0, 1
for {
select {
case c <- x:
x, y = y, x+y
case <-quit:
fmt.Println("quit")
return
}
}
}
func main() {
c := make(chan int)
quit := make(chan int)
go func() {
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
fibonacci(c, quit)
}
how is this working?
and while i'm trying to understand.
package main
import "fmt"
func b(c,quit chan int) {
c <-1
c <-2
c <-3
}
func main() {
c := make(chan int)
quit := make(chan int)
go func() {
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
b(c,quit)
}
this will sometimes print 1,2 sometimes print 1,2,3, why?
First, in the func fibonacci, the select statement tries select the first thing of the two following to finish:
c <- x
<- quit
It is fairly easy to understand <- quit, which is tries to receive a value from a channel called quit (and ignores the value received).
c <- x means sending a value that equals (is a copy of) x. It seems like unblocking, but in Go, sending over an unbuffered channel (which is explained in the Go tour) blocks when there is no receiver.
So here it means, waiting for a receiver to be ready to receive the value (or a space in the buffer, if it were a buffered channel), which in this code means fmt.Println(<-c), and then send the value to the receiver.
So this statement unblocks (finishes) whenever <-c is evaluated. That is, every iteration of the loop.
And for your code, while all value 1, 2, 3, is guaranteed to be sent over the channel (and received), func b returns and thus func main retains without guaranteeing the fmt.Println(3) finishes.
In Go, when the func main returns, the program terminates, and unfinished goroutine does not get a chance to finish its work - thus sometimes it prints 3 and soetimes it does not.
To better understand the Fibonacci example let's analyze the different parts.
First the anonymous function
go func() {
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
The "go" keyword will start a new goroutine, so now we have the "main" goroutine and this one, they're gonna be running concurrently.
The loop is telling us that we are going to execute this 10 times:
fmt.Println(<-c)
This call will block until we receive an integer from the channel and print it. Once this happens 10 times we are gonna signal that this goroutine has finished
quit <- 0
Now let's go back to the main goroutine, while the other goroutine was starting it has invoked the function "finbonacci", there's no "go" keyword so this call is blocking here.
fibonacci(c, quit)
Now let's analyze the function
x, y := 0, 1
for {
select {
case c <- x:
x, y = y, x+y
case <-quit:
fmt.Println("quit")
return
}
}
We start an infinite loop and on each iteration, we will try to execute one case from the select
The first case will try to send Fibonacci's sequence values indefinitely to the channel, at some point, once the other goroutine reaches the fmt.Println(<-c) statement, this case will be executed, the values will be recalculated and the next iteration of the loop will happen.
case c <- x:
x, y = y, x+y
The second case won't be able to run yet, since nobody is sending anything to the "quit" channel.
case <-quit:
fmt.Println("quit")
return
}
After 10 iterations the first goroutine will stop receiving new values, thus the first select case won't be able to run
case c <- x: // this will block after 10 times, nobody is reading
x, y = y, x+y
At this point, no case in the "select" can be executed, the main goroutine is "blocked" (more like paused from a logical point of view).
Finally, at some point, the first goroutine will signal that it has finished using the "quit" channel
quit <- 0
This allows the execution of the second "select" case, which will break the infinite loop and allow the "fibonacci" function to return and the main function will be able to finish in a clean way.
Hopefully, this helps you understand the Fibonacci example.
Now, moving to your code, you are not waiting for the anonymous goroutine to finish, in fact, it's not even finishing. Your "b" method is sending "1,2,3" and returns immediately.
Since it's the last part of the main function, the program terminates.
If you only see "1,2" sometimes is because the "fmt.Println" statement is too slow and the program terminates before it can print the 3.

Why is my code running slower after trying Goroutines?

I decided to try figuring out Goroutines and channels. I made a function that takes a list and adds 10 to every element. I then made another function that attempts to incorporate channels and goroutines. When I timed the code it ran much slower. I tried doing some research but was unable to figure anything out.
Here is my code with channels:
package main
import ("fmt"
"time")
func addTen(channel chan int) {
channel <- 10 + <-channel
}
func listPlusTen(list []int) []int {
channel := make(chan int)
for i:= 0; i < len(list); i++ {
go addTen(channel)
channel <- list[i]
list[i] = <-channel
}
return list
}
func main(){
var size int
list := make([]int, 0)
fmt.Print("Enter the list size: ")
fmt.Scanf("%d", &size)
for i:=0; i <= size; i++ {
list = append(list, i)
}
start := time.Now()
list = listPlusTen(list)
end := time.Now()
fmt.Println(end.Sub(start))
}
You are adding a lot of synchronization overhead to the baseline algorithm. You have len(list) goroutines, all waiting to read from a common channel. When you write to the channel, the scheduler picks one of those goroutines, and that goroutines adds 10, and writes to the channel, which enables the main goroutine again. It is hard to speculate without really measuring it, but if you move the goroutine creation outside the for-loop, then you will have only one goroutine, reducing the scheduler overhead. However, in any comparison to the baseline algorithm this will be slower, because each operation involves two synchronizations and two context switches, which take more that the algorithm itself.
I had a similar experience with go routines where I was making 20 http requests using the same function.
calling the function in a simple loop was much faster than using wait groups

How can we determine when the "last" worker process/thread is finished in Go?

I'll use a hacky inefficient prime number finder to make this question a little more concrete.
Let's say our main function fires off a bunch of "worker" goroutines. They will report their results to a single channnel which prints them. But not every worker will report something so we can't use a counter to know when the last job is finished. Or is there a way?
For the concrete example, here, main fires off goroutines to check whether the values 2...1000 are prime (yeah I know it is inefficient).
package main
import (
"fmt"
"time"
)
func main() {
c := make(chan int)
go func () {
for {
fmt.Print(" ", <- c)
}
}()
for n := 2; n < 1000; n++ {
go printIfPrime(n, c)
}
time.Sleep(2 * time.Second) // <---- THIS FEELS WRONG
}
func printIfPrime(n int, channel chan int) {
for d := 2; d * d <= n; d++ {
if n % d == 0 {
return
}
}
channel <- n
}
My problem is that I don't know how to reliably stop it at the right time. I tried adding a sleep at the end of main and it works (but it might take too long, and this is no way to write concurrent code!). I would like to know if there was a way to send a stop signal through a channel or something so main can stop at the right time.
The trick here is that I don't know how many worker responses there will be.
Is this impossible or is there a cool trick?
(If there's an answer for this prime example, great. I can probably generalize. Or maybe not. Maybe this is app specific?)
Use a WaitGroup.
The following code uses two WaitGroups. The main function uses wgTest to wait for print_if_prime functions to complete. Once they are done, it closes the channel to break the for loop in the printing goroutine. The main function uses wgPrint to wait for printing to complete.
package main
import (
"fmt"
"sync"
)
func main() {
c := make(chan int)
var wgPrint, wgTest sync.WaitGroup
wgPrint.Add(1)
go func(wg *sync.WaitGroup) {
defer wg.Done()
for n := range c {
fmt.Print(" ", n)
}
}(&wgPrint)
for n := 2; n < 1000; n++ {
wgTest.Add(1)
go print_if_prime(&wgTest, n, c)
}
wgTest.Wait()
close(c)
wgPrint.Wait()
}
func print_if_prime(wg *sync.WaitGroup, n int, channel chan int) {
defer wg.Done()
for d := 2; d*d <= n; d++ {
if n%d == 0 {
return
}
}
channel <- n
}
playground example

Better go-idiomatic way of writing this code?

Playing around with go, I threw together this code:
package main
import "fmt"
const N = 10
func main() {
ch := make(chan int, N)
done := make(chan bool)
for i := 0; i < N; i++ {
go (func(n int, ch chan int, done chan bool) {
for i := 0; i < N; i++ {
ch <- n*N + i
}
done <- true
})(i, ch, done)
}
numDone := 0
for numDone < N {
select {
case i := <-ch:
fmt.Println(i)
case <-done:
numDone++
}
}
for {
select {
case i := <-ch:
fmt.Println(i)
default:
return
}
}
}
Basically I have N channels doing some work and reporting it on the same channel -- I want to know when all the channels are done. So I have this other done channel that each worker goroutine sends a message on (message doesn't matter), and this causes main to count that thread as done. When the count gets to N, we're actually done.
Is this "good" go? Is there a more go-idiomatic way of doing this?
edit: To clarify a bit, I'm doubtful because the done channel seems to be doing a job that channel closing seems to be for, but of course I can't actually close the channel in any goroutine because all the routines share the same channel. So I'm using done to simulate a channel that does some kind of "buffered closing".
edit2: Original code wasn't really working since sometimes the done signal from a routine was read before the int it just put on ch. Needs a "cleanup" loop.
Here is an idiomatic use of sync.WaitGroup for you to study
(playground link)
package main
import (
"fmt"
"sync"
)
const N = 10
func main() {
ch := make(chan int, N)
var wg sync.WaitGroup
for i := 0; i < N; i++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
for i := 0; i < N; i++ {
ch <- n*N + i
}
}(i)
}
go func() {
wg.Wait()
close(ch)
}()
for i := range ch {
fmt.Println(i)
}
}
Note the use of closures in the two go routine definitions and note the second go statement to wait for all the routines to finish, then close the channel, so range can be used.
looks like you want a sync.WaitGroup (http://golang.org/pkg/sync/#WaitGroup)
Just use a WaitGroup! They are the built-in primitive that essentially let you wait for stuff in different goroutines to finish up.
http://golang.org/pkg/sync/#WaitGroup
As for your doubts, The way to thing about is that being done by closing a channel (done permanently) and being done with work (temporarily) are different.
In the first approximation the code seems more or less okay to me.
Wrt the details, the 'ch' should be buffered. Also the 'done' channel goroutine "accounting" might be possibly replaced with sync.WaitGroup.
If you're iterating over values generated from goroutines, you can iterate directly over the
communication channel:
for value := range ch {
println(value)
}
The only thing necessary for this is, that the channel ch is closed later on, or else the
loop would wait for new values forever.
This would effectively replace your for numDone < N when used in combination with sync.WaitGroup.
I was dealing with the same issue in some code of mine and found this to be a more than adequate solution.
The answer provides Go's idiom for handling multiple goroutines all sending across a single channel.

Python-style generators in Go

I'm currently working through the Tour of Go, and I thought that goroutines have been used similarly to Python generators, particularly with Question 66. I thought 66 looked complex, so I rewrote it to this:
package main
import "fmt"
func fibonacci(c chan int) {
x, y := 1, 1
for {
c <- x
x, y = y, x + y
}
}
func main() {
c := make(chan int)
go fibonacci(c)
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
}
This seems to work. A couple of questions:
If I turn up the buffer size on the channel to say, 10, fibonacci would fill up 10 further spots, as quickly as possible, and main would eat up the spots as quickly as it could go. Is this right? This would be more performant than a buffer size of 1 at the expense of memory, correct?
As the channel doesn't get closed by the fibonacci sender, what happens memory-wise when we go out of scope here? My expectation is that once c and go fibonacci is out of scope, the channel and everything on it gets garbage-collected. My gut tells me this is probably not what happens.
Yes, increasing the buffer size might drastically increase the execution speed of your program, because it will reduce the number of context switches. Goroutines aren't garbage-collected, but channels are. In your example, the fibonacci goroutine will run forever (waiting for another goroutine to read from the channel c), and the channel c will never be destroyed, because the fib-goroutine is still using it.
Here is another, sightly different program, which does not leak memory and is imho more similar to Python's generators:
package main
import "fmt"
func fib(n int) chan int {
c := make(chan int)
go func() {
x, y := 0, 1
for i := 0; i <= n; i++ {
c <- x
x, y = y, x+y
}
close(c)
}()
return c
}
func main() {
for i := range fib(10) {
fmt.Println(i)
}
}
Alternatively, if you do not know how many Fibonacci numbers you want to generate, you have to use another quit channel so that you can send the generator goroutine a signal when it should stop. This is whats explained in golang's tutorial https://tour.golang.org/concurrency/4.
I like #tux21b's answer; having the channel created in the fib() function makes the calling code nice and clean. To elaborate a bit, you only need a separate 'quit' channel if there's no way to tell the function when to stop when you call it. If you only ever care about "numbers up to X", you can do this:
package main
import "fmt"
func fib(n int) chan int {
c := make(chan int)
go func() {
x, y := 0, 1
for x < n {
c <- x
x, y = y, x+y
}
close(c)
}()
return c
}
func main() {
// Print the Fibonacci numbers less than 500
for i := range fib(500) {
fmt.Println(i)
}
}
If you want the ability to do either, this is a little sloppy, but I personally like it better than testing the condition in the caller and then signalling a quit through a separate channel:
func fib(wanted func (int, int) bool) chan int {
c := make(chan int)
go func() {
x, y := 0, 1
for i := 0; wanted(i, x); i++{
c <- x
x, y = y, x+y
}
close(c)
}()
return c
}
func main() {
// Print the first 10 Fibonacci numbers
for n := range fib(func(i, x int) bool { return i < 10 }) {
fmt.Println(n)
}
// Print the Fibonacci numbers less than 500
for n := range fib(func(i, x int) bool { return x < 500 }) {
fmt.Println(n)
}
}
I think it just depends on the particulars of a given situation whether you:
Tell the generator when to stop when you create it by
Passing an explicit number of values to generate
Passing a goal value
Passing a function that determines whether to keep going
Give the generator a 'quit' channel, test the values yourself, and tell it to quit when appropriate.
To wrap up and actually answer your questions:
Increasing the channel size would help performance due to fewer context switches. In this trivial example, neither performance nor memory consumption are going to be an issue, but in other situations, buffering the channel is often a very good idea. The memory used by make (chan int, 100) hardly seems significant in most cases, but it could easily make a big performance difference.
You have an infinite loop in your fibonacci function, so the goroutine running it will run (block on c <- x, in this case) forever. The fact that (once c goes out of scope in the caller) you won't ever again read from the channel you share with it doesn't change that. And as #tux21b pointed out, the channel will never be garbage collected since it's still in use. This has nothing to do with closing the channel (the purpose of which is to let the receiving end of the channel know that no more values will be coming) and everything to do with not returning from your function.
You could use closures to simulate a generator. Here is the example from golang.org.
package main
import "fmt"
// fib returns a function that returns
// successive Fibonacci numbers.
func fib() func() int {
a, b := 0, 1
return func() int {
a, b = b, a+b
return a
}
}
func main() {
f := fib()
// Function calls are evaluated left-to-right.
fmt.Println(f(), f(), f(), f(), f())
}
Using channels to emulate Python generators kind of works, but they introduce concurrency where none is needed, and it adds more complication than's probably needed. Here, just keeping the state explicitly is easier to understand, shorter, and almost certainly more efficient. It makes all your questions about buffer sizes and garbage collection moot.
type fibState struct {
x, y int
}
func (f *fibState) Pop() int {
result := f.x
f.x, f.y = f.y, f.x + f.y
return result
}
func main() {
fs := &fibState{1, 1}
for i := 0; i < 10; i++ {
fmt.Println(fs.Pop())
}
}

Resources