Python-style generators in Go - go

I'm currently working through the Tour of Go, and it seems to me that goroutines are being used much like Python generators, particularly in Question 66. I thought 66 looked complex, so I rewrote it like this:
package main
import "fmt"
func fibonacci(c chan int) {
x, y := 1, 1
for {
c <- x
x, y = y, x + y
}
}
func main() {
c := make(chan int)
go fibonacci(c)
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
}
This seems to work. A couple of questions:
If I turn up the buffer size on the channel to say, 10, fibonacci would fill up 10 further spots, as quickly as possible, and main would eat up the spots as quickly as it could go. Is this right? This would be more performant than a buffer size of 1 at the expense of memory, correct?
As the channel doesn't get closed by the fibonacci sender, what happens memory-wise when we go out of scope here? My expectation is that once c and go fibonacci is out of scope, the channel and everything on it gets garbage-collected. My gut tells me this is probably not what happens.

Yes, increasing the buffer size might drastically increase the execution speed of your program, because it will reduce the number of context switches. Goroutines aren't garbage-collected, but channels are. In your example, the fibonacci goroutine will run forever (waiting for another goroutine to read from the channel c), and the channel c will never be destroyed, because the fib-goroutine is still using it.
Here is another, slightly different program, which does not leak memory and is, IMHO, more similar to Python's generators:
package main
import "fmt"
func fib(n int) chan int {
c := make(chan int)
go func() {
x, y := 0, 1
for i := 0; i <= n; i++ {
c <- x
x, y = y, x+y
}
close(c)
}()
return c
}
func main() {
for i := range fib(10) {
fmt.Println(i)
}
}
Alternatively, if you do not know how many Fibonacci numbers you want to generate, you have to use a separate quit channel so that you can signal the generator goroutine when it should stop. This is what's explained in the Go tour: https://tour.golang.org/concurrency/4.

I like @tux21b's answer; having the channel created in the fib() function makes the calling code nice and clean. To elaborate a bit, you only need a separate 'quit' channel if there's no way to tell the function when to stop when you call it. If you only ever care about "numbers up to X", you can do this:
package main
import "fmt"
func fib(n int) chan int {
c := make(chan int)
go func() {
x, y := 0, 1
for x < n {
c <- x
x, y = y, x+y
}
close(c)
}()
return c
}
func main() {
// Print the Fibonacci numbers less than 500
for i := range fib(500) {
fmt.Println(i)
}
}
If you want the ability to do either, this is a little sloppy, but I personally like it better than testing the condition in the caller and then signalling a quit through a separate channel:
package main
import "fmt"
func fib(wanted func(int, int) bool) chan int {
c := make(chan int)
go func() {
x, y := 0, 1
for i := 0; wanted(i, x); i++ {
c <- x
x, y = y, x+y
}
close(c)
}()
return c
}
func main() {
// Print the first 10 Fibonacci numbers
for n := range fib(func(i, x int) bool { return i < 10 }) {
fmt.Println(n)
}
// Print the Fibonacci numbers less than 500
for n := range fib(func(i, x int) bool { return x < 500 }) {
fmt.Println(n)
}
}
I think it just depends on the particulars of a given situation whether you:
1. Tell the generator when to stop when you create it, by
   a. passing an explicit number of values to generate,
   b. passing a goal value, or
   c. passing a function that determines whether to keep going.
2. Give the generator a 'quit' channel, test the values yourself, and tell it to quit when appropriate.
To wrap up and actually answer your questions:
Increasing the channel size would help performance due to fewer context switches. In this trivial example, neither performance nor memory consumption is going to be an issue, but in other situations, buffering the channel is often a very good idea. The memory used by make(chan int, 100) hardly seems significant in most cases, but it could easily make a big performance difference.
You have an infinite loop in your fibonacci function, so the goroutine running it will run (block on c <- x, in this case) forever. The fact that (once c goes out of scope in the caller) you won't ever again read from the channel you share with it doesn't change that. And as @tux21b pointed out, the channel will never be garbage collected since it's still in use. This has nothing to do with closing the channel (the purpose of which is to let the receiving end of the channel know that no more values will be coming) and everything to do with not returning from your function.

You could use closures to simulate a generator. Here is the example from golang.org.
package main
import "fmt"
// fib returns a function that returns
// successive Fibonacci numbers.
func fib() func() int {
a, b := 0, 1
return func() int {
a, b = b, a+b
return a
}
}
func main() {
f := fib()
// Function calls are evaluated left-to-right.
fmt.Println(f(), f(), f(), f(), f())
}

Using channels to emulate Python generators kind of works, but it introduces concurrency where none is needed and adds more complication than is probably necessary. Here, just keeping the state explicitly is easier to understand, shorter, and almost certainly more efficient. It makes all your questions about buffer sizes and garbage collection moot.
package main
import "fmt"
type fibState struct {
x, y int
}
func (f *fibState) Pop() int {
result := f.x
f.x, f.y = f.y, f.x + f.y
return result
}
func main() {
fs := &fibState{1, 1}
for i := 0; i < 10; i++ {
fmt.Println(fs.Pop())
}
}

Related

Range for loop over an unBuffered Channel

I'm new to Go and going through the Go tour. I have the following code, which works as it should.
package main
import (
"fmt"
)
func fibonacci(n int, c chan int) {
x, y := 0, 1
for i := 0; i < n; i++ {
c <- x
x, y = y, x+y
}
close(c)
}
func main() {
c := make(chan int, 5)
// c := make(chan int) //doesn't work, why ?
go fibonacci(cap(c), c)
for i := range c {
fmt.Println(i)
}
}
But when I use an unbuffered channel instead of a buffered one, I don't get any output. Why is that?
When you pass cap(c) through to the fibonacci function, what value is passed? With the buffered channel, n == 5; with the unbuffered channel, n == 0,
so your for loop becomes
for i := 0; i < 0; i++ {
Actually, this is a really bad way of handling the situation. You are requiring the channel's capacity to equal the number of iterations.
I would not recommend using a channel this way. Think of a channel as something that lets goroutines operate concurrently; its buffer size should not double as a loop count.
If you pass the count in separately from the channel's capacity, the unbuffered channel will work as expected:
https://play.golang.org/p/G1b2vjTUCsV
cap(c) is zero when the channel is unbuffered, so the loop in fibonacci never runs.

Why doesn't concurrency speed up my fibonacci function?

Here is a concurrency example from A Tour of Go:
package main
import (
"fmt"
)
func fibonacci(n int, c chan int) {
x, y := 0, 1
for i := 0; i < n; i++ {
c <- x
x, y = y, x+y
}
close(c)
}
func main() {
c := make(chan int, 10)
go fibonacci(cap(c), c)
for i := range c {
fmt.Println(i)
}
}
I modified it to not use goroutines:
package main
import (
"fmt"
)
func fibonacci(n int) int {
if n == 0 || n == 1 {
return 1
}
x := 1
y := 1
for i := 0; i < n; i++ {
tmp := x
x = y
y = tmp + y
fmt.Println(x)
}
return x
}
func main() {
fibonacci(100)
}
However, both versions finish nearly instantly, even at n = 100000.
Does anyone have an example where goroutines do speed up calculations? I am wondering if perhaps some compiler setting is limiting the number of cores my program can use.
Why don't the goroutines speed up the calculations?
These 2 versions take almost exactly the same time because most of the work is in the Fibonacci function, so it doesn't matter whether it runs on the main goroutine, or a separate goroutine. When n is large, the concurrent version can be slower, because of the communication overhead over channels.
You can see from the above diagram that the only work running on the main thread is the Println calls, which take very little time to run.
But if processing each number on the main thread took more time, using a goroutine to generate the Fibonacci numbers might be faster.

How can we determine when the "last" worker process/thread is finished in Go?

I'll use a hacky inefficient prime number finder to make this question a little more concrete.
Let's say our main function fires off a bunch of "worker" goroutines. They report their results to a single channel, which prints them. But not every worker will report something, so we can't use a counter to know when the last job is finished. Or is there a way?
For the concrete example, here, main fires off goroutines to check whether the values 2...1000 are prime (yeah I know it is inefficient).
package main
import (
"fmt"
"time"
)
func main() {
c := make(chan int)
go func () {
for {
fmt.Print(" ", <- c)
}
}()
for n := 2; n < 1000; n++ {
go printIfPrime(n, c)
}
time.Sleep(2 * time.Second) // <---- THIS FEELS WRONG
}
func printIfPrime(n int, channel chan int) {
for d := 2; d * d <= n; d++ {
if n % d == 0 {
return
}
}
channel <- n
}
My problem is that I don't know how to reliably stop it at the right time. I tried adding a sleep at the end of main and it works (but it might take too long, and this is no way to write concurrent code!). I would like to know if there was a way to send a stop signal through a channel or something so main can stop at the right time.
The trick here is that I don't know how many worker responses there will be.
Is this impossible or is there a cool trick?
(If there's an answer for this prime example, great. I can probably generalize. Or maybe not. Maybe this is app specific?)
Use a WaitGroup.
The following code uses two WaitGroups. The main function uses wgTest to wait for print_if_prime functions to complete. Once they are done, it closes the channel to break the for loop in the printing goroutine. The main function uses wgPrint to wait for printing to complete.
package main
import (
"fmt"
"sync"
)
func main() {
c := make(chan int)
var wgPrint, wgTest sync.WaitGroup
wgPrint.Add(1)
go func(wg *sync.WaitGroup) {
defer wg.Done()
for n := range c {
fmt.Print(" ", n)
}
}(&wgPrint)
for n := 2; n < 1000; n++ {
wgTest.Add(1)
go print_if_prime(&wgTest, n, c)
}
wgTest.Wait()
close(c)
wgPrint.Wait()
}
func print_if_prime(wg *sync.WaitGroup, n int, channel chan int) {
defer wg.Done()
for d := 2; d*d <= n; d++ {
if n%d == 0 {
return
}
}
channel <- n
}

Go lang, channel processing sequence

I'm studying Go through 'A Tour of Go', and I find it hard to understand the order in which channel operations run:
package main
import "fmt"
import "time"
func sum(a []int, c chan int) {
sum := 0
for _, v := range a {
time.Sleep(1000 * time.Millisecond)
sum += v
}
c <- sum // send sum to c
}
func main() {
a := []int{7, 2, 8, -9, 4, 0}
c := make(chan int)
go sum(a[:len(a)/2], c)
go sum(a[len(a)/2:], c)
x, y := <-c, <-c // receive from c
fmt.Println(x, y, x+y)
fmt.Println("Print this first,")
}
If I run the code above, I expect:
Print this first,
17 -5 12
because goroutines run without blocking. But it actually prints:
17 -5 12
Print this first,
Here is another example that I found on the internet:
package main
import "fmt"
type Data struct {
i int
}
func func1(c chan *Data ) {
fmt.Println("Called")
for {
var t *Data;
t = <-c //receive
t.i += 10 //increment
c <- t //send it back
}
}
func main() {
c := make(chan *Data)
t := Data{10}
go func1(c)
println(t.i)
c <- &t //send a pointer to our t
i := <-c //receive the result
println(i.i)
println(t.i)
}
Here too, I expected it to print "Called" first, but the result is
10
20
20
Called
What am I misunderstanding? Please help me understand goroutines and channels.
In your first example, x, y := <-c, <-c will block until it reads off c twice, and then assign values to x, y. Channels aside, you have an assignment, a print statement, then another print statement. Those are all synchronous things, and will happen in the order you state them in. There's no way the second print statement would print first.
The second one is because fmt.Println writes to STDOUT and println to STDERR. If you are consistent (say use println everywhere) then you see:
10
Called
20
20
That's because there's a race between the first println(t.i) in main and the println("Called") that happens in the goroutine. I'm guessing that with GOMAXPROCS set to 1, this will happen consistently. With GOMAXPROCS set to NumCPU, I get a mix of results, sometimes looking like the above, and sometimes like this:
10Called
20
20

Noob on channels in Go

I'm trying to wrap my head around concurrency patterns in Go and was confused by this example from #69
package main
import "fmt"
func fibonacci(c, quit chan int) {
x, y := 0, 1
for {
select {
case c <- x:
x, y = y, x+y
case <-quit:
fmt.Println("quit")
return
}
}
}
func main() {
c := make(chan int)
quit := make(chan int)
go func() {
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
fibonacci(c, quit)
}
In particular, I don't see how
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
is supposed to work, since all we did was make the channel, and now we "receive" from it 10 times? I tried out other code where I create a channel and then try to receive from it right away and I always get an error, but this seems to work and I can't quite see how. Thanks for any help!
fmt.Println(<-c) will block until there's something to read from the channel. Since we start the for loop in a separate goroutine, it means the first iteration of the loop will simply sit idly and wait until there's something to read.
Then the fibonacci function starts, and pushes data down the channel. This will make the loop wake up and start printing.
I hope it makes better sense now.
I’m giving you a shorter version of the code above, which I think should be easier to understand. (I explain the differences below.) Consider this:
// http://play.golang.org/p/5CrBSu4wxd
package main
import "fmt"
func fibonacci(c chan int) {
x, y := 0, 1
for {
c <- x
x, y = y, x+y
}
}
func main() {
c := make(chan int)
go fibonacci(c)
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
}
This is a more straightforward version because your main function is clearly just printing 10 values from the channel, and then exiting; and there is a background goroutine that is filling the channel as long as a new value is needed.
This alternate version drops the quit channel, because the background goroutine simply dies when main() finishes (no need to kill it explicitly in such a simple example).
Of course this version also kills the use of select{}, which is the topic of #69. But seeing how both versions accomplish the same thing, except killing of the background goroutine, can perhaps be a good aid in understanding what select is doing.
Note, in particular, that if fibonacci() had a time.Sleep() as its first statement, the for loop would hang for that much time, but would eventually work.
Hope this helps!
P.S.: Just realized this version is just a simpler version than #68, so I’m not sure how much it’ll help. Oops. :-)
