golang implementing generator / yield with channels: odd channel behavior - go

The following code implements the yield pattern in Go. As an experiment, I was implementing an all-permutations generator. However, when I send the slice A to the channel without first creating a new copy of the array, I get an incorrect result.
Please see the code around "???". Can someone explain what happens under the covers here? I thought that since the channel is not buffered, I was guaranteed that after sending the slice to the channel, the result would be consumed before the generator continued.
package main
import (
"fmt"
)
func swap(A []int, i int, j int) {
t := A[i]
A[i] = A[j]
A[j] = t
}
func recurse(A []int, c chan []int, depth int) {
if depth == len(A) {
// ??? Why do I need to copy the data?
// If I do c <- A I get an incorrect answer.
ra := make([]int, len(A))
copy(ra, A)
c <- ra
return
}
for i := depth; i < len(A); i++ {
swap(A, depth, i)
recurse(A, c, depth+1)
swap(A, depth, i)
}
}
func yieldPermutations(A []int, c chan []int) {
recurse(A, c, 0)
close(c)
}
func main() {
A := []int{1, 2, 3}
c2 := make(chan []int)
go yieldPermutations(A, c2)
for v := range c2 {
fmt.Println(v)
}
}
If I do not copy the data, I get the following result:
[1 3 2]
[1 3 2]
[2 3 1]
[2 3 1]
[3 1 2]
[3 1 2]
Obviously, the correct result (which we get with data copy) is:
[1 2 3]
[1 3 2]
[2 1 3]
[2 3 1]
[3 2 1]
[3 1 2]

It's a mistake to think this code is like generators/yield in Python, and that's what's causing your error.
In Python, when you request the next item from a generator, the generator starts executing and stops when the next yield <value> statement is reached. There is no parallelism in Python's generators: the consumer runs until it wants a value, then the generator runs until it produces a value, then the consumer gets the value and continues execution.
In your Go code, the goroutine executes concurrently with the code that's consuming items. As soon as an item is read from the channel by the main code, the goroutine works concurrently to produce the next. The goroutine and the consumer both run until they reach the channel send/receive, then the value is sent from the goroutine to the consumer, then they both continue execution.
That means the backing array of A gets modified by the goroutine as it works to generate the next item, while the consumer may still be printing the slice it just received, which shares that same backing array. That's a race condition, and it causes your unexpected output. To demonstrate that this is a race, insert time.Sleep(time.Second) after the channel send. Then the code produces the correct results (albeit slowly): https://play.golang.org/p/uEa_k6Brcc
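A minimal sketch of the copy-before-send fix, written as a self-contained program. The helper names permute and collect are assumptions for illustration and are not from the question. Because every permutation is copied before the send, the receiver owns its slice and later swaps by the producer cannot change it.

```go
package main

import "fmt"

// permute sends every permutation of a on c. Each result is copied before
// the send, so later swaps on a cannot change a slice the receiver holds.
func permute(a []int, c chan<- []int, depth int) {
	if depth == len(a) {
		out := make([]int, len(a))
		copy(out, a)
		c <- out
		return
	}
	for i := depth; i < len(a); i++ {
		a[depth], a[i] = a[i], a[depth]
		permute(a, c, depth+1)
		a[depth], a[i] = a[i], a[depth] // undo the swap
	}
}

// collect (hypothetical helper) runs the producer in its own goroutine
// and gathers everything it sends until the channel is closed.
func collect(a []int) [][]int {
	c := make(chan []int)
	go func() {
		defer close(c)
		permute(a, c, 0)
	}()
	var all [][]int
	for p := range c {
		all = append(all, p)
	}
	return all
}

func main() {
	for _, p := range collect([]int{1, 2, 3}) {
		fmt.Println(p)
	}
}
```

Only the producer goroutine touches the shared backing array here, so no sleep is needed to get the six distinct permutations.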

Related

How channel work in using channel to find prime number problem?

I have an approach to solve the find prime number problem using Go like this:
package main
import (
"fmt"
)
// Generate natural seri number: 2,3,4,...
func GenerateNatural() chan int {
ch := make(chan int)
go func() {
for i := 2; ; i++ {
ch <- i
}
}()
return ch
}
// Filter: delete the number which is divisible by a prime number to find prime number
func PrimeFilter(in <-chan int, prime int) chan int {
out := make(chan int)
go func() {
for {
if i := <-in; i%prime != 0 {
out <- i
}
}
}()
return out
}
func main() {
ch := GenerateNatural()
for i := 0; i < 100; i++ {
prime := <-ch
fmt.Printf("%v: %v\n", i+1, prime)
ch = PrimeFilter(ch, prime)
}
}
I have no idea what happens in this approach:
I know that can not print the content of channel without interrupt: Can not print content of channel
Size of channel: Default buffer channel size is 1, that mean:
By default channels are unbuffered, which states that they will only
accept sends (chan <-) if there is a corresponding receive (<- chan)
which are ready to receive the sent value
I cannot imagine how the above Go program runs!
Could anybody please show me the step-by-step flow of the above Go program for the first 10 numbers or so?
This is a pretty convoluted example. In both functions, go func(){...}() creates an anonymous goroutine and runs it asynchronously, then returns the channel which will receive values from the goroutine. PrimeFilter returns a channel which will receive numbers not divisible by a certain candidate.
The idea is that prime := <-ch always takes the first element from the channel. So, to visualize the flow:
GenerateNatural() starts by sending numbers 2, 3, 4... to ch.
First loop iteration:
a. prime := <-ch reads the first (prime) number 2.
b. PrimeFilter(ch, 2) then continues receiving the rest of the numbers (3, 4, 5, ...), and sends numbers not divisible by 2 to the output channel. So, channel returned by PrimeFilter(ch, 2) will receive numbers (3, 5, 7, ...).
c. ch = PrimeFilter(ch, prime) in the main function now replaces the local ch variable with the output of PrimeFilter(ch, 2) from the previous step.
Second loop iteration:
a. prime := <-ch reads the first (prime) number from the current ch instance (this first number is 3).
b. PrimeFilter(ch, 3) then continues receiving the (already filtered) numbers, except for the first one (so, 5, 7, 9, ...), and sends numbers not divisible by 3 to the output channel. So, the channel returned by PrimeFilter(ch, 3) will receive numbers 5, 7, 11, ..., because 9 is divisible by 3.
c. ch = PrimeFilter(ch, prime) in the main function now replaces the local ch variable with the output of PrimeFilter(ch, 3) from the previous step.
...
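The flow above can be checked with a small sketch that collects the primes instead of printing them. The names generate, filter, and firstPrimes are assumptions for illustration. The pipeline goroutines are never stopped, which is acceptable in a short demo but would leak in a long-running program.

```go
package main

import "fmt"

// generate sends 2, 3, 4, ... on the returned channel.
func generate() <-chan int {
	ch := make(chan int)
	go func() {
		for i := 2; ; i++ {
			ch <- i
		}
	}()
	return ch
}

// filter forwards every value from in that is not divisible by prime.
func filter(in <-chan int, prime int) <-chan int {
	out := make(chan int)
	go func() {
		for {
			if v := <-in; v%prime != 0 {
				out <- v
			}
		}
	}()
	return out
}

// firstPrimes (hypothetical helper) returns the first n primes by
// stacking one filter stage on the pipeline per prime found.
func firstPrimes(n int) []int {
	primes := make([]int, 0, n)
	ch := generate()
	for len(primes) < n {
		p := <-ch // the head of the current pipeline is always prime
		primes = append(primes, p)
		ch = filter(ch, p)
	}
	return primes
}

func main() {
	fmt.Println(firstPrimes(10)) // prints "[2 3 5 7 11 13 17 19 23 29]"
}
```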

Go tutorial: Channels, Buffered Channels tutorial

I am going through Go's official tutorial and have difficulty understanding the difference between Channel and Buffered Channels. The links to the tutorials are https://tour.golang.org/concurrency/2 and https://tour.golang.org/concurrency/3
In the Channel tutorial, Channel c first received the sum of [7, 2, 8] which is 17 and then received the sum of [-9, 4, 0] which is -5. When reading from c, it first output -5 to x and then 17 to y, in LIFO order:
package main
import "fmt"
func sum(s []int, c chan int) {
sum := 0
for _, v := range s {
sum += v
}
c <- sum // send sum to c
}
func main() {
s := []int{7, 2, 8, -9, 4, 0}
c := make(chan int)
go sum(s[:len(s)/2], c)
go sum(s[len(s)/2:], c)
x, y := <-c, <-c // receive from c
fmt.Println(x, y, x+y)
}
(The above output is -5 17 12)
In the Buffered Channel tutorial, the output is 1 2, in FIFO order:
func main() {
ch := make(chan int, 2)
ch <- 1
ch <- 2
fmt.Println(<-ch)
fmt.Println(<-ch)
}
Why are they different?
The channel c in your first example (the unbuffered channel) is not acting as LIFO.
The order you see comes from the goroutines, which execute concurrently, so either one may send its sum first.
To see this, tweak your code to add one extra line in sum that prints the sum before sending it to the channel.
package main
import "fmt"
func sum(s []int, c chan int) {
sum := 0
for _, v := range s {
sum += v
}
fmt.Println("slice:", s)
fmt.Println(sum)
c <- sum // send sum to c
}
func main() {
s := []int{7, 2, 8, -9, 4, 0}
c := make(chan int)
go sum(s[:2], c)
go sum(s[2:4], c)
go sum(s[4:6], c)
x, y, z := <-c, <-c, <-c // receive from c
fmt.Println(x, y, z, x+y+z)
}
The output is:
slice: [4 0]
4
slice: [7 2]
9
slice: [8 -9]
-1
4 9 -1 12
So, you can see that x receives the 1st number that was sent through channel.
Furthermore, unbuffered channels send data directly to the receiver.
If you wanna know about the architecture of channels in go, you can watch this talk of gophercon-2017. I found this talk very helpful.
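If the results need to come back in a known order regardless of which goroutine finishes first, one option is to tag each value with its position. This is a sketch under that assumption; the result type and the sumPart and sumHalves helpers are hypothetical names, not part of the tour.

```go
package main

import "fmt"

// result tags a partial sum with the index of the half it came from.
type result struct {
	idx int
	sum int
}

// sumPart adds up s and sends the tagged total on c.
func sumPart(idx int, s []int, c chan<- result) {
	total := 0
	for _, v := range s {
		total += v
	}
	c <- result{idx: idx, sum: total}
}

// sumHalves returns the sums of the two halves in slice order,
// no matter which goroutine sends first.
func sumHalves(s []int) [2]int {
	c := make(chan result)
	go sumPart(0, s[:len(s)/2], c)
	go sumPart(1, s[len(s)/2:], c)
	var sums [2]int
	for i := 0; i < 2; i++ {
		r := <-c
		sums[r.idx] = r.sum
	}
	return sums
}

func main() {
	sums := sumHalves([]int{7, 2, 8, -9, 4, 0})
	fmt.Println(sums[0], sums[1], sums[0]+sums[1]) // prints "17 -5 12"
}
```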

Incorrect values inside goroutines when looping

I have read through CommonMistakes as well as run my code through the -race flag, but I can't seem to pinpoint what is wrong here:
package main
import (
"fmt"
)
func main() {
i := 1
totalHashFields := 6
for i <= totalHashFields {
Combinations(totalHashFields, i, func(c []int) {
fmt.Println("Outside goroutine:", c)
go func(c []int) {
fmt.Println("Inside goroutine:", c)
}(c)
})
i++
}
}
func Combinations(n, m int, emit func([]int)) {
s := make([]int, m)
last := m - 1
var rc func(int, int)
rc = func(i, next int) {
for j := next; j < n; j++ {
s[i] = j
if i == last {
emit(s)
} else {
rc(i+1, j+1)
}
}
return
}
rc(0, 0)
}
(The Combinations function is a combinations algo for those interested)
Here is some of the output from fmt.Println:
Outside goroutine: [0 1 4]
Inside goroutine: [5 5 5]
Outside goroutine: [0 1 2 3 4 5]
Inside goroutine: [5 5 5 5 5 5]
Basically, even though I'm passing c as a parameter to my anonymous go function, the value is consistently different from the value outside of this scope. In the output above, I expected the 2 "Inside" values to also be [0 1 4] and [0 1 2 3 4 5], respectively.
The problem is that your goroutines all work on distinct int slices, but these share a common backing array: after Combinations completes, the slice s will be full of 5s. Your c in main shares the underlying backing array with s.
But your goroutines do not start executing until Combinations is done, so once they do start, they will see the final value of s, which is just 5s.
Passing in the slice like you did does not help here, as this makes a proper copy of the slice value c but not of the backing array.
Try
Combinations(totalHashFields, i, func(c []int) {
fmt.Println("Outside goroutine:", c)
cpy := make([]int, len(c))
copy(cpy, c)
go func(c []int) {
fmt.Println("Inside goroutine:", c)
}(cpy)
})
to make a "deep copy" of c.
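The difference between copying a slice value and copying its elements can be seen in isolation. This is a small sketch; aliasDemo is a hypothetical helper name.

```go
package main

import "fmt"

// aliasDemo modifies a slice after taking both a plain assignment and an
// element-wise copy of it, then reports what each one sees at index 0.
func aliasDemo() (viaAlias, viaCopy int) {
	s := []int{1, 2, 3}

	alias := s // copies the slice header only; the backing array is shared

	cpy := make([]int, len(s))
	copy(cpy, s) // copies the elements into a separate backing array

	s[0] = 99
	return alias[0], cpy[0]
}

func main() {
	a, c := aliasDemo()
	fmt.Println(a, c) // prints "99 1"
}
```

Passing a slice as a function argument behaves like the assignment to alias: the callee gets its own header but the same elements.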

Multiple goroutines listening on one channel

I have multiple goroutines trying to receive on the same channel simultaneously. It seems like the last goroutine that starts receiving on the channel gets the value. Is this somewhere in the language spec or is it undefined behaviour?
c := make(chan string)
for i := 0; i < 5; i++ {
go func(i int) {
<-c
c <- fmt.Sprintf("goroutine %d", i)
}(i)
}
c <- "hi"
fmt.Println(<-c)
Output:
goroutine 4
EDIT:
I just realized that it's more complicated than I thought. The message gets passed around all the goroutines.
c := make(chan string)
for i := 0; i < 5; i++ {
go func(i int) {
msg := <-c
c <- fmt.Sprintf("%s, hi from %d", msg, i)
}(i)
}
c <- "original"
fmt.Println(<-c)
Output:
original, hi from 0, hi from 1, hi from 2, hi from 3, hi from 4
NOTE: the above output is outdated in more recent versions of Go (see comments)
Yes, it's complicated, but there are a couple of rules of thumb that should make things feel much more straightforward.
Prefer using formal arguments for the channels you pass to goroutines instead of accessing channels in global scope. You can get more compiler checking this way, and better modularity too.
Avoid both reading and writing on the same channel in a particular goroutine (including the 'main' one). Otherwise, deadlock is a much greater risk.
Here's an alternative version of your program, applying these two guidelines. This case demonstrates many writers & one reader on a channel:
c := make(chan string)
for i := 1; i <= 5; i++ {
go func(i int, co chan<- string) {
for j := 1; j <= 5; j++ {
co <- fmt.Sprintf("hi from %d.%d", i, j)
}
}(i, c)
}
for i := 1; i <= 25; i++ {
fmt.Println(<-c)
}
http://play.golang.org/p/quQn7xePLw
It creates the five go-routines writing to a single channel, each one writing five times. The main go-routine reads all twenty five messages - you may notice that the order they appear in is often not sequential (i.e. the concurrency is evident).
This example demonstrates a feature of Go channels: it is possible to have multiple writers sharing one channel; Go will interleave the messages automatically.
The same applies for one writer and multiple readers on one channel, as seen in the second example here:
c := make(chan int)
var w sync.WaitGroup
w.Add(5)
for i := 1; i <= 5; i++ {
go func(i int, ci <-chan int) {
j := 1
for v := range ci {
time.Sleep(time.Millisecond)
fmt.Printf("%d.%d got %d\n", i, j, v)
j += 1
}
w.Done()
}(i, c)
}
for i := 1; i <= 25; i++ {
c <- i
}
close(c)
w.Wait()
This second example includes a wait imposed on the main goroutine, which would otherwise exit promptly and cause the other five goroutines to be terminated early (thanks to olov for this correction).
In both examples, no buffering was needed. It is generally a good principle to view buffering as a performance enhancer only. If your program does not deadlock without buffers, it won't deadlock with buffers either (but the converse is not always true). So, as another rule of thumb, start without buffering then add it later as needed.
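The many-writers example counts its 25 receives by hand. A variation, sketched here with a hypothetical fanIn helper, closes the channel once every writer has finished so that the single reader can simply range over it. It still uses an unbuffered channel and send-only channel arguments, in line with the rules of thumb above.

```go
package main

import (
	"fmt"
	"sync"
)

// fanIn starts `writers` goroutines that each send perWriter messages on one
// unbuffered channel, and returns everything the single reader received.
func fanIn(writers, perWriter int) []string {
	c := make(chan string)
	var wg sync.WaitGroup
	wg.Add(writers)
	for i := 1; i <= writers; i++ {
		go func(i int, out chan<- string) {
			defer wg.Done()
			for j := 1; j <= perWriter; j++ {
				out <- fmt.Sprintf("hi from %d.%d", i, j)
			}
		}(i, c)
	}
	// Close the channel once every writer is done so the reader's loop ends.
	go func() {
		wg.Wait()
		close(c)
	}()
	var msgs []string
	for m := range c {
		msgs = append(msgs, m)
	}
	return msgs
}

func main() {
	msgs := fanIn(5, 5)
	fmt.Println(len(msgs)) // prints "25"; the order of the messages varies
}
```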
Late reply, but I hope this helps others in the future with questions like "Long Polling, "Global" Button, Broadcast to everyone?"
Effective Go explains the issue:
Receivers always block until there is data to receive.
That means that you cannot have more than 1 goroutine listening to 1 channel and expect ALL goroutines to receive the same value.
Run this Code Example.
package main
import "fmt"
func main() {
c := make(chan int)
for i := 1; i <= 5; i++ {
go func(i int) {
for v := range c {
fmt.Printf("count %d from goroutine #%d\n", v, i)
}
}(i)
}
for i := 1; i <= 25; i++ {
c<-i
}
close(c)
}
You will not see "count 1" more than once even though there are 5 goroutines listening to the channel. This is because each value sent on a channel is delivered to exactly one receiver: all five goroutines block waiting on the channel, one of them receives the value, and by then it has been removed from the channel, so the next goroutine to receive gets the next count value.
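The same point can be checked by counting instead of reading printed lines. This is a sketch; deliveries is a hypothetical helper, and the mutex is there only to protect the shared map used for counting.

```go
package main

import (
	"fmt"
	"sync"
)

// deliveries sends 1..values on one channel read by `workers` goroutines
// and returns how many times each value was received.
func deliveries(workers, values int) map[int]int {
	c := make(chan int)
	seen := make(map[int]int)
	var mu sync.Mutex
	var wg sync.WaitGroup
	wg.Add(workers)
	for w := 0; w < workers; w++ {
		go func() {
			defer wg.Done()
			for v := range c {
				mu.Lock()
				seen[v]++
				mu.Unlock()
			}
		}()
	}
	for v := 1; v <= values; v++ {
		c <- v
	}
	close(c)
	wg.Wait() // all receivers have finished, so seen is safe to read
	return seen
}

func main() {
	seen := deliveries(5, 25)
	fmt.Println(len(seen), seen[1]) // prints "25 1"
}
```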
I've studied existing solutions and created a simple broadcast library: https://github.com/grafov/bcast.
group := bcast.NewGroup() // you created the broadcast group
go bcast.Broadcasting(0) // the group accepts messages and broadcast it to all members
member := group.Join() // then you join member(s) from other goroutine(s)
member.Send("test message") // or send messages of any type to the group
member1 := group.Join() // then you join member(s) from other goroutine(s)
val := member1.Recv() // and for example listen for messages
It is complicated.
Also, see what happens with GOMAXPROCS = NumCPU+1. For example,
package main
import (
"fmt"
"runtime"
)
func main() {
runtime.GOMAXPROCS(runtime.NumCPU() + 1)
fmt.Print(runtime.GOMAXPROCS(0))
c := make(chan string)
for i := 0; i < 5; i++ {
go func(i int) {
msg := <-c
c <- fmt.Sprintf("%s, hi from %d", msg, i)
}(i)
}
c <- ", original"
fmt.Println(<-c)
}
Output:
5, original, hi from 4
And, see what happens with buffered channels. For example,
package main
import "fmt"
func main() {
c := make(chan string, 5+1)
for i := 0; i < 5; i++ {
go func(i int) {
msg := <-c
c <- fmt.Sprintf("%s, hi from %d", msg, i)
}(i)
}
c <- "original"
fmt.Println(<-c)
}
Output:
original
You should be able to explain these cases too.
Yes, it is possible to have multiple goroutines listening on one channel. The key point is the message itself: you can define a message that counts how many receivers still need to see it, like this:
package main
import (
"fmt"
"sync"
)
type obj struct {
msg string
receiver int
}
func main() {
ch := make(chan *obj) // an unbuffered or a buffered channel both work
var wg sync.WaitGroup
receiver := 25 // specify receiver count
sender := func() {
o := &obj {
msg: "hello everyone!",
receiver: receiver,
}
ch <- o
}
recv := func(idx int) {
defer wg.Done()
o := <-ch
fmt.Printf("%d received at %d\n", idx, o.receiver)
o.receiver--
if o.receiver > 0 {
ch <- o // forward to others
} else {
fmt.Printf("last receiver: %d\n", idx)
}
}
go sender()
for i := 0; i < receiver; i++ {
wg.Add(1)
go recv(i)
}
wg.Wait()
}
The output is random:
5 received at 25
24 received at 24
6 received at 23
7 received at 22
8 received at 21
9 received at 20
10 received at 19
11 received at 18
12 received at 17
13 received at 16
14 received at 15
15 received at 14
16 received at 13
17 received at 12
18 received at 11
19 received at 10
20 received at 9
21 received at 8
22 received at 7
23 received at 6
2 received at 5
0 received at 4
1 received at 3
3 received at 2
4 received at 1
last receiver: 4
Quite an old question, but nobody mentioned this, I think.
First, the outputs of both examples can be different if you run the code many times. This is not related to the Go version.
The output of the 1st example can be goroutine 4, goroutine 0, goroutine 1, ... In fact, any of the goroutines can be the one that sends the string to the main goroutine.
The main goroutine is one of the goroutines, so it's also waiting for data from the channel.
Which goroutine should receive the data? Nobody knows. It's not in the language spec.
Also, the output of the 2nd example also can be anything:
(I added the square brackets just for clarity)
// [original, hi from 4]
// [[[[[original, hi from 4], hi from 0], hi from 2], hi from 1], hi from 3]
// [[[[[original, hi from 4], hi from 1], hi from 0], hi from 2], hi from 3]
// [[[[[original, hi from 0], hi from 2], hi from 1], hi from 3], hi from 4]
// [[original, hi from 4], hi from 1]
// [[original, hi from 0], hi from 4]
// [[[original, hi from 4], hi from 1], hi from 0]
// [[[[[original, hi from 4], hi from 1], hi from 0], hi from 3], hi from 2]
// [[[[original, hi from 0], hi from 2], hi from 1], hi from 3]
//
// ......anything can be the output.
This is not magic, nor a mysterious phenomenon.
If there are multiple threads being executed, no one knows exactly which thread will acquire the resource. The language doesn't determine it; rather, the OS takes care of it. This is why multithreaded programming is quite complicated.
A goroutine is not an OS thread (it is scheduled by the Go runtime), but it behaves somewhat similarly.
Using sync.Cond is a good choice.
ref: https://pkg.go.dev/sync#Cond
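A minimal sketch of waking every waiting goroutine with sync.Cond. The wakeAll helper and the ready flag are assumptions for illustration. Each waiter re-checks the flag in a loop under the lock, so a goroutine that starts waiting after the broadcast still sees ready set and does not block.

```go
package main

import (
	"fmt"
	"sync"
)

// wakeAll starts n goroutines that wait on a condition, broadcasts once,
// and returns how many goroutines observed the signal.
func wakeAll(n int) int {
	var mu sync.Mutex
	cond := sync.NewCond(&mu)
	ready := false
	woken := 0

	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			mu.Lock()
			for !ready { // re-check the condition after every wake-up
				cond.Wait()
			}
			woken++
			mu.Unlock()
		}()
	}

	mu.Lock()
	ready = true
	mu.Unlock()
	cond.Broadcast() // wakes every goroutine currently blocked in Wait

	wg.Wait()
	return woken
}

func main() {
	fmt.Println(wakeAll(5)) // prints "5"
}
```

Unlike a channel send, which is consumed by a single receiver, Broadcast reaches all waiters at once.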

Why are my channels deadlocking?

I am trying to program a simple Go script that calculates the sum of the natural numbers up to 8:
package main
import "fmt"
func sum(nums []int, c chan int) {
var sum int = 0
for _, v := range nums {
sum += v
}
c <- sum
}
func main() {
allNums := []int{1, 2, 3, 4, 5, 6, 7, 8}
c1 := make(chan int)
c2 := make(chan int)
sum(allNums[:len(allNums)/2], c1)
sum(allNums[len(allNums)/2:], c2)
a := <- c1
b := <- c2
fmt.Printf("%d + %d is %d :D", a, b, a + b)
}
However, running this program produces the following output.
throw: all goroutines are asleep - deadlock!
goroutine 1 [chan send]:
main.sum(0x44213af00, 0x800000004, 0x420fbaa0, 0x2f29f, 0x7aaa8, ...)
main.go:9 +0x6e
main.main()
main.go:16 +0xe6
goroutine 2 [syscall]:
created by runtime.main
/usr/local/go/src/pkg/runtime/proc.c:221
exit status 2
Why is my code deadlocking? I am confused because I am using 2 separate channels to calculate the sub-sums. How are the two channels dependent at all?
Your channels are unbuffered, so the c <- sum line in sum() will block until some other routine reads from the other end.
One option would be to add buffers to the channels, so you can write a value to the channel without it blocking:
c1 := make(chan int, 1)
c2 := make(chan int, 1)
Alternatively, if you run the sum() function as a separate goroutine, then it can block while your main() function continues to the point where it reads from the channels.
Yes, you need to add go like
go sum(allNums[:len(allNums)/2], c1)
go sum(allNums[len(allNums)/2:], c2)
or add a buffer to each channel:
c1 := make(chan int, 1)
c2 := make(chan int, 1)
I haven't used Go in a while, so this may not be the case, but from what I remember you need go to get another goroutine started, so:
go sum(allNums[:len(allNums)/2], c1)
go sum(allNums[len(allNums)/2:], c2)
If sum isn't running on another goroutine, it tries to execute:
c <- sum
But nothing's reading c; the code reading c has not been reached yet because it's waiting for sum to finish, and sum won't finish because it needs to give it to that code first!
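Putting the goroutine fix together, a complete version might look like the sketch below. The halfSums helper is a hypothetical name; the channels stay unbuffered, and each send now has main waiting on the other end.

```go
package main

import "fmt"

// sum adds up nums and sends the total on c.
func sum(nums []int, c chan<- int) {
	total := 0
	for _, v := range nums {
		total += v
	}
	c <- total // blocks until another goroutine receives from c
}

// halfSums runs each sum in its own goroutine, so the blocking sends
// happen while the caller goes on to receive from both channels.
func halfSums(nums []int) (int, int) {
	c1 := make(chan int)
	c2 := make(chan int)
	go sum(nums[:len(nums)/2], c1)
	go sum(nums[len(nums)/2:], c2)
	return <-c1, <-c2
}

func main() {
	a, b := halfSums([]int{1, 2, 3, 4, 5, 6, 7, 8})
	fmt.Printf("%d + %d is %d\n", a, b, a+b) // prints "10 + 26 is 36"
}
```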
