Trying to understand goroutines - go

I've been playing around with the following code from A Tour of Go, but I don't understand what is going on when I apply some minor changes. The original code is this
package main
import (
"fmt"
"time"
)
func say(s string) {
for i := 0; i < 5; i++ {
time.Sleep(100 * time.Millisecond)
fmt.Println(s)
}
}
func main() {
go say("world")
say("hello")
}
and it produces this
world
hello
hello
world
world
hello
hello
world
world
hello
which is OK: five times hello, five times world. It starts to get strange when I call
say("world")
go say("hello")
Now the output is just
world
world
world
world
world
No hello whatsoever. It gets even weirder with two goroutines
go say("world")
go say("hello")
Now there is no output at all. When I change i < 5 to i < 2 and call
go say("world")
say("hello")
I get
world
hello
hello
What am I missing here?

In the case of
say("world")
go say("hello")
The "world" call must complete before the "hello" goroutine is started. The "hello" goroutine does not run or complete because main returns.
For
go say("world")
go say("hello")
the goroutines do not run or complete because main returns.
Use sync.WaitGroup to prevent main from exiting before the goroutines complete:
func say(wg *sync.WaitGroup, s string) {
defer wg.Done()
for i := 0; i < 5; i++ {
time.Sleep(100 * time.Millisecond)
fmt.Println(s)
}
}
func main() {
var wg sync.WaitGroup
wg.Add(2)
go say(&wg, "world")
go say(&wg, "hello")
wg.Wait()
}

Congratulations on learning Go. As someone new, it helps to understand concurrency and how it differs from parallelism.
Concurrency
Concurrency is like a juggler juggling several balls in the air with one hand. No matter how many balls he is juggling, only one ball touches his hand at any moment.
Parallelism
When the juggler starts juggling more balls with another hand in parallel, we have two concurrent processes running at the same time.
Goroutines are great because they are always concurrent and become parallel automatically when more than one computing core is available and the GOMAXPROCS setting allows it.
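To see how much parallelism your own machine offers, a minimal sketch like the following can help. It only queries the runtime; the helper name parallelismLimit is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelismLimit (hypothetical helper) reports how many OS threads may
// execute Go code at the same time. Passing 0 to runtime.GOMAXPROCS
// queries the current setting without changing it.
func parallelismLimit() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:  ", parallelismLimit())
}
```

If GOMAXPROCS is 1, goroutines still interleave (concurrency), but no two of them run Go code at the same instant (no parallelism).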
The One-handed Juggler
Back to the one-handed, single-cored, concurrent juggler. Imagine him juggling three balls named "hello", "world", and "mars" respectively with the hand being the main routine.
var balls = []string{"hello", "world", "mars"}
func main() {
go say(balls[0])
go say(balls[1])
go say(balls[2])
}
Or more appropriately,
func main() {
for _, ball := range balls {
go say(ball)
}
}
Once the three balls are thrown up into the air sequentially, the juggler simply retreats his hand right away. That is, the main routine exits before the first ball thrown can even land on his hand. Shame, the balls just drop to the ground. Bad show.
In order to get the balls back in his hand, the juggler has to make sure he waits for them. This means his hand needs to be able to keep track of and count the balls he threw and learn when each is landing.
The most straightforward way is to use sync.WaitGroup:
package main
import (
"fmt"
"time"
"sync"
)
var balls = []string{"hello", "world", "mars"}
var wg sync.WaitGroup
func main() {
for _, ball := range balls {
// One ball thrown
wg.Add(1)
go func(b string) {
// Signals the group that this routine is done.
defer wg.Done()
// Each ball hovers for 1 second.
time.Sleep(1 * time.Second)
fmt.Println(b)
// The deferred wg.Done() runs when the goroutine exits.
}(ball)
}
// The juggler can do whatever he pleases while the
// balls are hanging in the air.
// This hand will come back to grab the balls after 1s.
wg.Wait()
}
WaitGroup is simple. When a goroutine is spawned, one adds to a "backlog counter" with WaitGroup.Add(1), and the goroutine calls WaitGroup.Done() to decrease the counter when it finishes. Once the backlog reaches 0, all goroutines are done and WaitGroup.Wait() stops waiting (and the juggler grabs the balls!).
While using channels for synchronization is fine, it is encouraged to use the other available concurrency tools where appropriate, especially when channels would make the code more complex and harder to comprehend.

This happens because the main function has returned.
When main returns, all other goroutines are terminated abruptly and the program exits.
If you add a statement such as
time.Sleep(100 * time.Second)
before main returns, the goroutines get enough time to finish and the output appears.
A better practice in Go is to use a channel, which is used to communicate between goroutines. You can use it to make main wait for the background goroutines to finish.
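As a rough sketch of the channel approach (the done channel and the runBoth helper are names I made up, not part of the Tour code), each goroutine sends one value when it finishes and main receives once per goroutine:

```go
package main

import (
	"fmt"
	"time"
)

// say prints s five times, then signals completion on done.
func say(s string, done chan<- struct{}) {
	for i := 0; i < 5; i++ {
		time.Sleep(100 * time.Millisecond)
		fmt.Println(s)
	}
	done <- struct{}{}
}

// runBoth (hypothetical helper) starts two goroutines and blocks until
// both have signalled. It returns the number of signals received.
func runBoth() int {
	done := make(chan struct{})
	go say("world", done)
	go say("hello", done)

	received := 0
	for i := 0; i < 2; i++ {
		<-done // blocks until one goroutine has finished
		received++
	}
	return received
}

func main() {
	fmt.Println("goroutines finished:", runBoth())
}
```

Unlike a fixed time.Sleep, main waits exactly as long as the goroutines need.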

With
func main() {
go say("world")
say("hello")
}
You are creating two separate goroutines: one is the main function's goroutine and the other is the one started by go say("world"). Normally, when a function is called, the program jumps into that function, executes all the code inside it, and then continues at the line after the call.
With the go keyword you do not jump into the function. Instead, the function starts running in a new goroutine, and execution continues at the line just after the go statement without waiting for it.
Therefore a goroutine may not have time to finish before the main goroutine is done, and when main returns the program exits.
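One way to convince yourself that a go statement returns right away is to time it. This is only a sketch; slow and measure are made-up names, and the 300 ms delay is arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// slow simulates a function that takes a while, then signals that it is done.
func slow(done chan<- struct{}) {
	time.Sleep(300 * time.Millisecond)
	close(done)
}

// measure (hypothetical helper) returns how long the go statement itself
// took and how long it took until the goroutine actually finished.
func measure() (launch, total time.Duration) {
	done := make(chan struct{})
	start := time.Now()

	go slow(done) // returns immediately, slow keeps running in the background
	launch = time.Since(start)

	<-done // wait here until slow has finished
	total = time.Since(start)
	return launch, total
}

func main() {
	launch, total := measure()
	fmt.Println("go statement returned after:", launch)
	fmt.Println("goroutine finished after:   ", total)
}
```

The first duration should be tiny compared to the second, which is why main can reach its end long before the goroutine does.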

Related

Tour of Go Exercise #1: Concurrency and the go keyword

I'm going through 'A Tour of Go' and have been editing most of the lessons to make sure I fully understand them. I have a question regarding: https://tour.golang.org/concurrency/1
package main
import (
"fmt"
"time"
)
func say(s string) {
for i := 0; i < 5; i++ {
time.Sleep(100 * time.Millisecond)
fmt.Println(s)
}
}
func main() {
go say("world")
say("hello")
}
Leaving main the way it is produces a random ordering of hellos and worlds because the threads are executing in different orders each time the program runs. I have two questions:
If I remove go from the line with world and add it to the line with hello, world is printed 5 times and hello is not printed at all. Why is that?
If I add go in front of both lines, nothing is printed at all. Why is that?
I have some experience with concurrency using C++ (although it was a while ago) and some more recent experience with Python, but would describe my overall experience with concurrency as fairly novice-level.
Thanks!
The program terminates before you get a chance to see the results.
You can fix this by adding a statement that ensures main doesn't exit before the other routines are finished.
From Go - Concurrency:
With a goroutine we return immediately to the next line and don't wait for the function to complete.
They give a code example:
package main
import "fmt"
func f(n int) {
for i := 0; i < 10; i++ {
fmt.Println(n, ":", i)
}
}
func main() {
go f(0)
var input string
fmt.Scanln(&input)
}
In regards to the code above:
This is why the call to the Scanln function has been included; without it the program would exit before being given the opportunity to print all the numbers.
If I remove the go keyword from the line say("world") and add it to the line say("hello"), world is printed 5 times and hello is not printed at all. Why is that?
If I add the go in front of both lines, nothing is printed at all. Why is that?
In both cases the same problem occurs: you are executing unsynchronized operations, so your program returns from main before all the started work has been processed. The specification states: "When that [main] function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete."
When you use the go keyword, the runtime starts the execution of a function call as an independent concurrent thread of control. You need to use synchronization primitives to make the return from main wait for the remaining asynchronous jobs.
The language offers channels, and the sync package offers WaitGroup, to help you implement that behavior.
For example
package main
import (
"fmt"
"time"
"sync"
)
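var wg sync.WaitGroup
func say(s string) {
// Mark this goroutine as finished when say returns.
defer wg.Done()
for i := 0; i < 5; i++ {
time.Sleep(100 * time.Millisecond)
fmt.Println(s)
}
}
func main() {
// One Add per goroutine, done before the goroutines are started.
wg.Add(2)
go say("world")
go say("hello")
// Block until both goroutines have called Done.
wg.Wait()
}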

go tutorial select statement

I'm working through the examples at tour.golang.org, and I've encountered this code I don't really understand:
package main
import "fmt"
func fibonacci(c, quit chan int) {
x, y := 0, 1
for {
select {
case c <- x: // case: send x to channel c?
x, y = y, x+y
case <-quit: // case: receive from channel quit?
fmt.Println("quit")
return
}
}
}
func main() {
c := make(chan int)
quit := make(chan int)
go func() { // when does this get called?
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
fibonacci(c, quit)
}
I understand the basics of how channels work, but what I don't get is how the above select statement works. The explanation on the tutorial says:
"The select statement lets a goroutine wait on multiple communication operations.
A select blocks until one of its cases can run, then it executes that case. It chooses one at random if multiple are ready."
But how are the cases getting executed? From what I can tell, they're saying:
case: send x to channel c
case: receive from quit
I think I understand that the second one executes only if quit has a value, which is done later inside the go func(). But what is the first case checking for? Also, inside the go func(), we're apparently printing values from c, but c shouldn't have anything in it at that point? The only explanation I can think of is that the go func() somehow executes after the call to fibonacci(). I'm guessing it's a goroutine which I don't fully understand either, it just seems like magic.
I'd appreciate if someone could go through this code and tell me what it's doing.
Remember that channels will block, so the select statement reads:
select {
case c <- x: // if I can send to c
// update my variables
x, y = y, x+y
case <-quit: // If I can receive from quit then I'm supposed to exit
fmt.Println("quit")
return
}
The absence of a default case means "If I can't send to c and I can't read from quit, block until I can."
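For contrast, here is a small sketch of what a default case changes. trySend is a made-up helper, not part of the Tour code: with default, select never blocks and simply reports whether the send was possible.

```go
package main

import "fmt"

// trySend (hypothetical helper) attempts to send v on c without blocking.
// It reports whether the send happened.
func trySend(c chan int, v int) bool {
	select {
	case c <- v:
		return true
	default:
		// No receiver is ready and there is no buffer space: give up.
		return false
	}
}

func main() {
	unbuffered := make(chan int)
	fmt.Println(trySend(unbuffered, 1)) // false: nobody is receiving

	buffered := make(chan int, 1)
	fmt.Println(trySend(buffered, 1)) // true: the buffer has room
	fmt.Println(trySend(buffered, 2)) // false: the buffer is full
}
```

The fibonacci select has no default, so it waits instead of giving up.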
Then in main you spin off another function, as a goroutine, that reads from c to print the results
for i:=0; i<10; i++ {
fmt.Println(<-c) // read in from c
}
quit <- 0 // send to quit so that fibonacci returns and main can exit
The key here is to remember that channels block, and you're using two unbuffered channels. Using go to spin off the second function lets you consume from c so fibonacci will continue.
Goroutines are often described as lightweight "green threads" managed by the Go runtime. Starting a function call with the keyword go spins it off into a new goroutine that runs independently of the main line of execution. In essence, main() and go func() ... run concurrently. This is important since we're using a producer/consumer pattern in this code.
fibonacci produces values and sends them to c, and the anonymous goroutine that's spawned from main consumes values from c and processes them (in this case, "processing them" just means printing to the screen). We can't simply produce all the values and then consume them, because c will block. Furthermore fibonacci will produce more values forever (or until integer overflow anyway) so even if you had a magic channel that had an infinitely long buffer, it would never get to the consumer.
There are two key things to understanding this code example:
First, let's review how unbuffered channels work. From the documentation
If the channel is unbuffered, the sender blocks until the receiver has
received the value.
Note that both channels in the code example, c and quit are unbuffered.
Secondly, when we use the go keyword to start a new goroutine, it executes concurrently with the other goroutines. So in the example, we have two goroutines running: the main goroutine, which runs func main(), and the goroutine started by go func()... inside func main().
I added some inline comments here which should make things clearer:
package main
import "fmt"
func fibonacci(c, quit chan int) {
x, y := 0, 1
for { // this is equivalent to a while loop, without a stop condition
select {
case c <- x: // when we can send to channel c, and because c is unbuffered, we can only send to channel c when someone tries to receive from it
x, y = y, x+y
case <-quit: // when we can receive from channel quit, and because quit is unbuffered, we can only receive from channel quit when someone tries to send to it
fmt.Println("quit")
return
}
}
}
func main() {
c := make(chan int)
quit := make(chan int)
go func() { // this runs in another goroutine, separate from the main goroutine
for i := 0; i < 10; i++ {
fmt.Println(<-c)
}
quit <- 0
}()
fibonacci(c, quit) // this doesn't start with the go keyword, so it will run on the go routine started by func main()
}
You've pretty much got it.
inside the go func(), we're apparently printing values from c, but c shouldn't have anything in it at that point? The only explanation I can think of is that the go func() somehow executes after the call to fibonacci(). I'm guessing it's a goroutine
Yes, the go keyword starts a goroutine, so the func() will run at the same time as fibonacci(c, quit). The receive from the channel in the Println simply blocks until there is something to receive.
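If you want to check the hand-off yourself, here is a variation of the Tour code as a sketch. firstN is a made-up helper, and I swapped the quit channel's element type to struct{} and closed it instead of sending 0; the select logic is the same.

```go
package main

import "fmt"

// fibonacci sends Fibonacci numbers on c until quit is closed.
func fibonacci(c chan<- int, quit <-chan struct{}) {
	x, y := 0, 1
	for {
		select {
		case c <- x: // proceeds only when the consumer is receiving
			x, y = y, x+y
		case <-quit: // proceeds once the consumer has closed quit
			return
		}
	}
}

// firstN (hypothetical helper) collects the first n values in a separate
// goroutine while fibonacci runs on the calling goroutine.
func firstN(n int) []int {
	c := make(chan int)
	quit := make(chan struct{})
	result := make([]int, 0, n)

	go func() {
		for i := 0; i < n; i++ {
			result = append(result, <-c) // blocks until fibonacci sends
		}
		close(quit) // tell fibonacci to stop
	}()

	fibonacci(c, quit) // returns only after quit has been closed
	return result
}

func main() {
	fmt.Println(firstN(10)) // [0 1 1 2 3 5 8 13 21 34]
}
```

Each value is handed over one at a time: the send in fibonacci and the receive in the goroutine complete together.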

Need help to understand this weird behaviour of goroutines

I have the following code using goroutines:
package main
import (
"fmt"
"time"
)
func thread_1(i int) {
time.Sleep(time.Second * 1)
fmt.Println("thread_1: i: ",i)
}
func thread_2(i int) {
time.Sleep(time.Second * 1)
fmt.Println("thread_2: i: ",i)
}
func main() {
for i := 0; i < 100; i++ {
go thread_1(i)
go thread_2(i)
}
var input string
fmt.Scanln(&input)
}
I was expecting each goroutine to wait for a second and then print its value of i.
However, both goroutines waited for 1 second each and then printed all the values of i at once.
Another question in the same context is:
I've a server-client application where server spawns a receiver go routine to receive data from client one for each client connection. Then the receiver go routine spawns a worker go routine called processor to process the data. There could be multiple receiver go routines and hence multiple processor go routines.
In such a case some receiver and processor go routines may hog indefinitely.
Please help me understand this behaviour.
You spawn 100 goroutines running thread_1 and 100 goroutines running thread_2 in one big batch. Each of these 200 goroutines sleeps for one second and then prints and ends. So yes, the behavior is to be expected: 200 goroutines each sleeping for 1 second in parallel and then 200 goroutines printing in parallel.
(And I do not understand your second question)
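To make the point measurable, here is a small sketch (sleepAll and runDemo are made-up names, and I shortened the sleep to 50 ms): the sleeps overlap, so the total time is about one sleep, not the sum of all of them.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// sleepAll (hypothetical helper) starts n goroutines that each sleep for d
// and returns how long it took until all of them were done.
func sleepAll(n int, d time.Duration) time.Duration {
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			time.Sleep(d) // all goroutines sleep at the same time
		}()
	}
	wg.Wait()
	return time.Since(start)
}

// runDemo runs 100 goroutines that each sleep for 50 ms.
func runDemo() time.Duration {
	return sleepAll(100, 50*time.Millisecond)
}

func main() {
	// Sequentially this would take 100 * 50 ms = 5 s.
	fmt.Println("elapsed:", runDemo())
}
```

If you want one line per second instead, the waiting has to happen in the loop that starts the goroutines, or each goroutine has to sleep for a different duration.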

Goroutine scheduling algorithm

package main
import ()
func main() {
msgQueue := make(chan int, 1000000)
netAddr := "127.0.0.1"
token := make(chan int, 10)
for i := 0; i < 10; i++ {
token <- i
}
go RecvReq(netAddr, msgQueue)
for {
select {
case req := <-msgQueue:
go HandleReq(req, token)
}
}
}
func RecvReq(addr string,msgQueue chan int){
msgQueue<-//get from network
}
func HandleReq(msg int, token chan int) {
//step 1
t := <-token
//step 2
//code here...(don't call runtime.park)
//step 3
//code here...(may call runtime.park)
//step 4
token <- t
}
System: 1 CPU, 2 cores
Go version: go1.3 linux/amd64
Problem description:
msgQueue receives requests continuously through RecvReq, so the main goroutine keeps creating new goroutines, but the waiting goroutines wait forever. The first 10 goroutines stop at step 3, and the goroutines created after them stop at step 1.
Q1: How can I make the waiting goroutines run while new goroutines are being created all the time?
Q2: How can I balance RecvReq and HandleReq? Messages are received 10 times faster than they are handled.
Alas, your question is not very clear, but there are several issues here.
You create a buffered channel of size n then insert n items into it. Don't do this - or to be clearer, don't do this until you know it's needed. Buffered channels usually fall into the 'premature optimisation' category. Start with unbuffered channels so you can work out how the goroutines co-operate. When it's working (free of deadlocks), measure the performance, add buffering, try again.
Your select has only one guard, so it behaves exactly as if the select weren't there and the case body were the only code in the loop.
You are trying to spawn off new goroutines for every message. Is this really what you wanted? You may find you can use a static mesh of goroutines, perhaps 10 in your case, and the result may be a program in which the intent is clearer. It would also give a small saving because the runtime would not have to spawn and clean up goroutines dynamically (however, you should be concerned with correct behaviour first, before worrying about any inefficiencies).
Your RecvReq is missing from the playground example, which is not executable.
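To illustrate the "static mesh" idea from the points above, here is a rough sketch of a fixed pool of workers reading from an unbuffered queue. Everything in it is an assumption for illustration: handle stands in for your HandleReq work, processAll and the message values are made up, and no network code is involved.

```go
package main

import (
	"fmt"
	"sync"
)

// handle is a hypothetical stand-in for the real request handling.
func handle(msg int) int {
	return msg * 2
}

// processAll (hypothetical helper) feeds msgs to a fixed number of workers
// and returns the sum of their results.
func processAll(msgs []int, workers int) int {
	queue := make(chan int)   // unbuffered: the producer waits for a free worker
	results := make(chan int) // unbuffered: workers wait for the collector

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// A plain range over the channel; no single-case select needed.
			for m := range queue {
				results <- handle(m)
			}
		}()
	}

	// Producer: plays the role of RecvReq.
	go func() {
		for _, m := range msgs {
			queue <- m
		}
		close(queue)
	}()

	// Close results once every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(processAll([]int{1, 2, 3, 4, 5}, 3)) // 30
}
```

Because the queue is unbuffered, the producer can never get ahead of the workers, which also addresses the rate imbalance: the receiver is slowed down to the speed at which messages are handled. The worker count replaces the token channel.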

Why do I need to run Walk with a new subroutine?

I’m writing the Walk function in the go tutorial that basically traverses a tree in-order. What I have works:
package main
import (
"fmt"
"code.google.com/p/go-tour/tree"
)
// Walk walks the tree t sending all values
// from the tree to the channel ch.
func Walk__helper(t *tree.Tree, ch chan int) {
if (t == nil) {
return
}
Walk__helper(t.Left, ch)
ch <- t.Value
Walk__helper(t.Right, ch)
}
func Walk(t *tree.Tree, ch chan int) {
Walk__helper(t, ch)
close(ch)
}
func main() {
ch := make(chan int)
go Walk(tree.New(1), ch)
for v := range ch {
fmt.Println(v)
}
}
Why must I use go Walk(tree.New(1), ch) instead of just Walk(tree.New(1), ch)?
I was under the impression that the go keyword basically spawns a new thread. In that case, we’d run into issues since the for loop might run before the subroutine completes.
Strangely, when I take out the go keyword, I get a deadlock. This is rather counterintuitive to me. What exactly is the go keyword doing here?
The key point here is range when coupled with a channel.
When you range over a channel (in this case, ch), it will wait for items to be sent on the channel before iterating through the loop. This is a safe, "blocking" action, that will not deadlock while it waits for the channel to receive an item.
The deadlock occurs when not using a goroutine because your channel isn't buffered. If you don't use a goroutine, the call is synchronous: Walk puts something on the channel and blocks until that value is received. It never gets received, because the loop that would receive it only starts after Walk returns.
I was under the impression that the go keyword basically spawns a new thread
That is incorrect. There are many more important implementation details required to understand what goes on there. You should separate your mental model of a goroutine from that of a thread, and just think of a goroutine as a concurrently executing piece of code, without a "thread".

Resources