In chapter 8 of The Go Programming Language, there is a description to the concurrency echo server as below:
The arguments to the function started by go are evaluated when the go statement itself is executed; thus input.Text() is evaluated in the main goroutine.
I don't understand this. Why the input.Text() is evaluated at the main goroutine? Shouldn't it be in the go echo() goroutine?
// Copyright © 2016 Alan A. A. Donovan & Brian W. Kernighan.
// License: https://creativecommons.org/licenses/by-nc-sa/4.0/
// See page 224.
// Reverb2 is a TCP server that simulates an echo.
package main
import (
"bufio"
"fmt"
"log"
"net"
"strings"
"time"
)
func echo(c net.Conn, shout string, delay time.Duration) {
fmt.Fprintln(c, "\t", strings.ToUpper(shout))
time.Sleep(delay)
fmt.Fprintln(c, "\t", shout)
time.Sleep(delay)
fmt.Fprintln(c, "\t", strings.ToLower(shout))
}
//!+
func handleConn(c net.Conn) {
input := bufio.NewScanner(c)
for input.Scan() {
go echo(c, input.Text(), 1*time.Second)
}
// NOTE: ignoring potential errors from input.Err()
c.Close()
}
//!-
func main() {
l, err := net.Listen("tcp", "localhost:8000")
if err != nil {
log.Fatal(err)
}
for {
conn, err := l.Accept()
if err != nil {
log.Print(err) // e.g., connection aborted
continue
}
go handleConn(conn)
}
}
code is here: https://github.com/adonovan/gopl.io/blob/master/ch8/reverb2/reverb.go
How go keyword works in Go, see
Go_statements:
The function value and parameters are evaluated as usual in the calling goroutine, but unlike with a regular call, program execution does not wait for the invoked function to complete. Instead, the function begins executing independently in a new goroutine. When the function terminates, its goroutine also terminates. If the function has any return values, they are discarded when the function completes.
The function value and parameters are evaluated in place with the go keyword (same for the defer keyword see an example for defer keyword).
To understand the evaluation order, let's try this:
go have()(fun("with Go."))
Let's run this and read the code comments for the evaluation order:
package main
import (
"fmt"
"sync"
)
func main() {
go have()(fun("with Go."))
fmt.Print("some ") // evaluation order: ~ 3
wg.Wait()
}
func have() func(string) {
fmt.Print("Go ") // evaluation order: 1
return funWithGo
}
func fun(msg string) string {
fmt.Print("have ") // evaluation order: 2
return msg
}
func funWithGo(msg string) {
fmt.Println("fun", msg) // evaluation order: 4
wg.Done()
}
func init() {
wg.Add(1)
}
var wg sync.WaitGroup
Output:
Go have some fun with Go.
Explanation go have()(fun("with Go.")):
First in place evaluation takes place here:
go have()(...) first have() part runs and the result is fmt.Print("Go ") and return funWithGo, then fun("with Go.") runs, and the result is fmt.Print("have ") and return "with Go."; now we have go funWithGo("with Go.").
So the final goroutine call is go funWithGo("with Go.")
This is a call to start a new goroutine so really we don't know when it will run. So there is a chance for the next line to run: fmt.Print("some "), then we wait here wg.Wait(). Now the goroutine runs this funWithGo("with Go.") and the result is fmt.Println("fun", "with Go.") then wg.Done(); that is all.
Let's rewrite the above code, just replace named functions with anonymous one, so this code is same as above:
For example see:
func have() func(string) {
fmt.Print("Go ") // evaluation order: 1
return funWithGo
}
And cut this code select the have part in the go have() and paste then select the have part in func have() and press Delete on the keyboard, then you'll have this:
This is even more beautiful, with the same result, just replace all functions with anonymous functions:
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
wg.Add(1)
go func() func(string) {
fmt.Print("Go ") // evaluation order: 1
return func(msg string) {
fmt.Println("fun", msg) // evaluation order: 4
wg.Done()
}
}()(func(msg string) string {
fmt.Print("have ") // evaluation order: 2
return msg
}("with Go."))
fmt.Print("some ") // evaluation order: ~ 3
wg.Wait()
}
Let me explain it with a simple example:
1. Consider this simple code:
i := 1
go fmt.Println(i) // 1
This is clear enough: the output is 1.
But if the Go designers decided to evaluate the function argument at the function run-time nobody knows the value of i; you might change the i in your code (see the next example)
Now let's do this closure:
i := 1
go func() {
time.Sleep(1 * time.Second)
fmt.Println(i) // ?
}()
The output is really unknown, and if the main goroutine exits sooner, it even won't have a chance to run: Wake up and print the i, which is i itself may change to that specific moment.
Now let's solve it like so:
i := 1
go func(i int) {
fmt.Printf("Step 3 i is: %d\n", i) // i = 1
}(i)
This anonymous function argument is of type int and it is a value type, and the value of i is known, and the compiler-generated code pushes the value 1 (i) to the stack, so this function, will use the value 1, when the time comes (A time in the future).
All (The Go Playground):
package main
import (
"fmt"
"sync"
"time"
)
func main() {
i := 1
go fmt.Println(i) // 1 (when = unknown)
go fmt.Println(2) // 2 (when = unknown)
go func() { // closure
time.Sleep(1 * time.Second)
fmt.Println(" This won't have a chance to run", i) // i = unknown (when = unknown)
}()
i = 3
wg := new(sync.WaitGroup)
wg.Add(1)
go func(i int) {
defer wg.Done()
fmt.Printf("Step 3 i is: %d\n", i) // i = 3 (when = unknown)
}(i)
i = 4
go func(step int) { // closure
fmt.Println(step, i) // i=? (when = unknown)
}(5)
i = 5
fmt.Println(i) // i=5
wg.Wait()
}
Output:
5
5 5
2
1
Step 3 i is: 3
The Go Playground output:
5
5 5
1
2
Step 3 i is: 3
As you may be noticed, the order of 1 and 2 is random, and your output may differ (See the code comments).
Related
When I add a defer in a function I expect that it will be always called when the function ends.
I noticed that it does not happen when the function is times out.
package main
import (
"context"
"fmt"
"time"
)
func service1(ctx context.Context, r *Registry) {
ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
defer func() {
r.Unset("service 1")
}()
r.Set("service 1")
go service2(ctx, r)
select {
case <-ctx.Done():
cancel()
break
}
}
func service2(ctx context.Context, r *Registry) {
defer func() {
r.Unset("service 2")
}()
r.Set("service 2")
time.Sleep(time.Millisecond * 300)
}
type Registry struct {
entries map[string]bool
}
func (r *Registry)Set(key string) {
r.entries[key] = true
}
func (r *Registry)Unset(key string) {
r.entries[key] = false
}
func (r *Registry)Print() {
for key, val := range r.entries {
fmt.Printf("%s -> %v\n", key, val)
}
}
func NewRegistry() *Registry {
r := Registry{}
r.entries = make(map[string]bool)
return &r
}
func main() {
r := NewRegistry()
ctx, cancel := context.WithTimeout(context.Background(), time.Millisecond*200)
go service1(ctx, r)
// go service3(ctx, r)
select {
case <-ctx.Done():
fmt.Printf("context err: %s\n", ctx.Err())
cancel()
}
r.Print()
}
In the example above, the defer in service2() is never called and that's why the output is:
service 1 -> false
service 2 -> true
instead of
service 1 -> false
service 2 -> false
I understand that timeout means "stop executing" but it's reasonable to me execute deferred code. I could not find any explanation of this behavior.
And the second part of the question - how to modify the service or Registry to be resistant to such situations?
Answer of 1st part
Say you have a function f1() which uses defer to call f2(), i.e. defer f2() . The fact is that the f2 will be called if and only if f1 completes even if a run-time panic occurs. More specifically, look at go-defer.
Now our concern is about using defer in goroutine. We also have to remember that a go-routine exits if it's parent function completes of exits.
So if we use defer in a go-routine function, then if the parent fuction completes or exits, then go-routine function must exit. Since it exits (not completes) the defer statement will not execute. It will be clear we draw the state of your program.
As you see,
at 1st millisecond, service1() completes before others. So, service2() exits without executing defer statement and 'service 2' won't be set to false. Since service1() completes, it's defer will execute and 'service 1' will be set to false.
at 2nd millisecond, the main() completes and program finishes.
So we see how this program executes.
Answer of 2nd part
One possible solution i tried is increase time in service1() or decrease time in service2().
I have a fairly simple program, which is supposed to self-terminate after a specified duration of time (for example, one second)
The code:
package main
import (
"fmt"
"time"
)
func emit(wordChannel chan string, done chan bool) {
defer close(wordChannel)
words := []string{"The", "quick", "brown", "fox"}
i := 0
t := time.NewTimer(1 * time.Second)
for {
select {
case wordChannel <- words[i]:
i++
if i == len(words) {
i = 0
}
// Please ignore the following case
case <-done:
done <- true
// fmt.Printf("Got done!\n")
close(done)
return
case <-t.C:
fmt.Printf("\n\nGot done!\n\n")
return
}
}
}
func main() {
mainWordChannel := make(chan string)
// Please ignore mainDoneChannel
mainDoneChannel := make(chan bool)
go emit(mainWordChannel, mainDoneChannel)
for word := range mainWordChannel {
fmt.Printf("%s ", word)
}
}
I compile and execute the binary, and you can see the execution here.
That is clearly above 1 second.
Go's documentation on NewTimer reads this:
func NewTimer
func NewTimer(d Duration) *Timer
NewTimer creates a new Timer that will send the current time on its channel after at least duration d.
Can someone kindly help me understand what's happening here? Why isn't the program terminating exactly (or closely at least) after 1 second?
Timer works as intended. It sends a value to a channel.
I think it is useful to read about select statement
Each iteration in your loop can write to a channel.
If > 1 communication can proceed there is no guarantee which are.
So, if add delay to read loop like this :
for word := range mainWordChannel {
fmt.Printf("%s ", word)
time.Sleep(time.Millisecond)
}
In this case, only read from timer.C can proceed. And program will end.
I recently started learning go and I am really impressed with all the features. I been playing with go routines and term-ui and facing some trouble. I am trying to exit this code from console after I run it but it just doesn't respond. If I run it without go-routine it does respond to my q key press event.
Any help is appreciated.
My code
package main
import (
"fmt"
"github.com/gizak/termui"
"time"
"strconv"
)
func getData(ch chan string) {
i := 0
for {
ch <- strconv.Itoa(i)
i++
time.Sleep(time.Second)
if i == 20 {
break
}
}
}
func Display(ch chan string) {
err := termui.Init()
if err != nil {
panic(err)
}
defer termui.Close()
termui.Handle("/sys/kbd/q", func(termui.Event) {
fmt.Println("q captured")
termui.Close()
termui.StopLoop()
})
for elem := range ch {
par := termui.NewPar(elem)
par.Height = 5
par.Width = 37
par.Y = 4
par.BorderLabel = "term ui example with chan"
par.BorderFg = termui.ColorYellow
termui.Render(par)
}
}
func main() {
ch := make(chan string)
go getData(ch)
Display(ch)
}
This is possibly the answer you are looking for. First off, you aren't using termui correctly. You need to call it's Loop function to start the Event loop so that it can actually start listening for the q key. Loop is called last because it essentially takes control of the main goroutine from then on until StopLoop is called and it quits.
In order to stop the goroutines, it is common to have a "stop" channel. Usually it is a chan struct{} to save memory because you don't ever have to put anything in it. Wherever you want the goroutine to possibly stop and shutoff (or do something else perhaps), you use a select statement with the channels you are using. This select is ordered, so it will take from them in order unless they block, in which case it tries the next one, so the stop channel usually goes first. The stop channel normally blocks, but to get it to take this path, simply close()ing it will cause this path to be chosen in the select. So we close() it in the q keyboard handler.
package main
import (
"fmt"
"github.com/gizak/termui"
"strconv"
"time"
)
func getData(ch chan string, stop chan struct{}) {
i := 0
for {
select {
case <-stop:
break
case ch <- strconv.Itoa(i):
}
i++
time.Sleep(time.Second)
if i == 20 {
break
}
}
}
func Display(ch chan string, stop chan struct{}) {
for {
var elem string
select {
case <-stop:
break
case elem = <-ch:
}
par := termui.NewPar(elem)
par.Height = 5
par.Width = 37
par.Y = 4
par.BorderLabel = "term ui example with chan"
par.BorderFg = termui.ColorYellow
termui.Render(par)
}
}
func main() {
ch := make(chan string)
stop := make(chan struct{})
err := termui.Init()
if err != nil {
panic(err)
}
defer termui.Close()
termui.Handle("/sys/kbd/q", func(termui.Event) {
fmt.Println("q captured")
close(stop)
termui.StopLoop()
})
go getData(ch, stop)
go Display(ch, stop)
termui.Loop()
}
I have the following sample code. I want to maintain 4 goroutines running at all times. They have the possibility of panicking. In the case of the panic, I have a recover where I restart the goroutine.
The way I implemented works but I am not sure whether its the correct and proper way to do this. Any thoughts
package main
import (
"fmt"
"time"
)
var gVar string
var pCount int
func pinger(c chan int) {
for i := 0; ; i++ {
fmt.Println("adding ", i)
c <- i
}
}
func printer(id int, c chan int) {
defer func() {
if err := recover(); err != nil {
fmt.Println("HERE", id)
fmt.Println(err)
pCount++
if pCount == 5 {
panic("TOO MANY PANICS")
} else {
go printer(id, c)
}
}
}()
for {
msg := <-c
fmt.Println(id, "- ping", msg, gVar)
if msg%5 == 0 {
panic("PANIC")
}
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan int = make(chan int, 2)
gVar = "Preflight"
pCount = 0
go pinger(c)
go printer(1, c)
go printer(2, c)
go printer(3, c)
go printer(4, c)
var input string
fmt.Scanln(&input)
}
You can extract the recover logic in a function such as:
func recoverer(maxPanics, id int, f func()) {
defer func() {
if err := recover(); err != nil {
fmt.Println("HERE", id)
fmt.Println(err)
if maxPanics == 0 {
panic("TOO MANY PANICS")
} else {
go recoverer(maxPanics-1, id, f)
}
}
}()
f()
}
And then use it like:
go recoverer(5, 1, func() { printer(1, c) })
Like Zan Lynx's answer, I'd like to share another way to do it (although it's pretty much similar to OP's way.) I used an additional buffered channel ch. When a goroutine panics, the recovery function inside the goroutine send its identity i to ch. In for loop at the bottom of main(), it detects which goroutine's in panic and whether to restart by receiving values from ch.
Run in Go Playground
package main
import (
"fmt"
"time"
)
func main() {
var pCount int
ch := make(chan int, 5)
f := func(i int) {
defer func() {
if err := recover(); err != nil {
ch <- i
}
}()
fmt.Printf("goroutine f(%v) started\n", i)
time.Sleep(1000 * time.Millisecond)
panic("goroutine in panic")
}
go f(1)
go f(2)
go f(3)
go f(4)
for {
i := <-ch
pCount++
if pCount >= 5 {
fmt.Println("Too many panics")
break
}
fmt.Printf("Detected goroutine f(%v) panic, will restart\n", i)
f(i)
}
}
Oh, I am not saying that the following is more correct than your way. It is just another way to do it.
Create another function, call it printerRecover or something like it, and do your defer / recover in there. Then in printer just loop on calling printerRecover. Add in function return values to check if you need the goroutine to exit for some reason.
The way you implemented is correct. Just for me the approach to maintain exactly 4 routines running at all times looks not much go_way, either handling routine's ID, either spawning in defer which may leads unpredictable stack due to closure. I don't think you can efficiently balance resource this way. Why don't you like to simple spawn worker when it needed
func main() {
...
go func(tasks chan int){ //multiplexer
for {
task = <-tasks //when needed
go printer(task) //just spawns handler
}
}(ch)
...
}
and let runtime do its job? This way things are done in stdlib listeners/servers and them known to be efficient enough. goroutines are very lightweight to spawn and runtime is quite smart to balance load. Sure you must to recover either way. It is my very personal opinion.
This code selects all xml files in the same folder, as the invoked executable and asynchronously applies processing to each result in the callback method (in the example below, just the name of the file is printed out).
How do I avoid using the sleep method to keep the main method from exiting? I have problems wrapping my head around channels (I assume that's what it takes, to synchronize the results) so any help is appreciated!
package main
import (
"fmt"
"io/ioutil"
"path"
"path/filepath"
"os"
"runtime"
"time"
)
func eachFile(extension string, callback func(file string)) {
exeDir := filepath.Dir(os.Args[0])
files, _ := ioutil.ReadDir(exeDir)
for _, f := range files {
fileName := f.Name()
if extension == path.Ext(fileName) {
go callback(fileName)
}
}
}
func main() {
maxProcs := runtime.NumCPU()
runtime.GOMAXPROCS(maxProcs)
eachFile(".xml", func(fileName string) {
// Custom logic goes in here
fmt.Println(fileName)
})
// This is what i want to get rid of
time.Sleep(100 * time.Millisecond)
}
You can use sync.WaitGroup. Quoting the linked example:
package main
import (
"net/http"
"sync"
)
func main() {
var wg sync.WaitGroup
var urls = []string{
"http://www.golang.org/",
"http://www.google.com/",
"http://www.somestupidname.com/",
}
for _, url := range urls {
// Increment the WaitGroup counter.
wg.Add(1)
// Launch a goroutine to fetch the URL.
go func(url string) {
// Decrement the counter when the goroutine completes.
defer wg.Done()
// Fetch the URL.
http.Get(url)
}(url)
}
// Wait for all HTTP fetches to complete.
wg.Wait()
}
WaitGroups are definitely the canonical way to do this. Just for the sake of completeness, though, here's the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done," and have the main goroutine wait until each spawned routine has reported its completion.
func main() {
c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
for i := 0; i < 100; i++ {
go func() {
doSomething()
c <- struct{}{} // signal that the routine has completed
}()
}
// Since we spawned 100 routines, receive 100 messages.
for i := 0; i < 100; i++ {
<- c
}
}
sync.WaitGroup can help you here.
package main
import (
"fmt"
"sync"
"time"
)
func wait(seconds int, wg * sync.WaitGroup) {
defer wg.Done()
time.Sleep(time.Duration(seconds) * time.Second)
fmt.Println("Slept ", seconds, " seconds ..")
}
func main() {
var wg sync.WaitGroup
for i := 0; i <= 5; i++ {
wg.Add(1)
go wait(i, &wg)
}
wg.Wait()
}
Although sync.waitGroup (wg) is the canonical way forward, it does require you do at least some of your wg.Add calls before you wg.Wait for all to complete. This may not be feasible for simple things like a web crawler, where you don't know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.
I wrote a solution using channels, avoiding waitGroup in my solution the the Tour of Go - web crawler exercise. Each time one or more go-routines are started, you send the number to the children channel. Each time a go routine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are done.
My only remaining concern is the hard-coded size of the the results channel, but that is a (current) Go limitation.
// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.waitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls
// (done) and results (results). Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
results chan string
children chan int
done chan int
}
// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
// we buffer results to 1000, so we cannot crawl more pages than that.
return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}
// recursionController.Add: convenience function to add children to controller (similar to waitGroup)
func (rc recursionController) Add(children int) {
rc.children <- children
}
// recursionController.Done: convenience function to remove a child from controller (similar to waitGroup)
func (rc recursionController) Done() {
rc.done <- 1
}
// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
fmt.Println("Controller waiting...")
var children, done int
for {
select {
case childrenDelta := <-rc.children:
children += childrenDelta
// fmt.Printf("children found %v total %v\n", childrenDelta, children)
case <-rc.done:
done += 1
// fmt.Println("done found", done)
default:
if done > 0 && children == done {
fmt.Printf("Controller exiting, done = %v, children = %v\n", done, children)
close(rc.results)
return
}
}
}
}
Full source code for the solution
Here is a solution that employs WaitGroup.
First, define 2 utility methods:
package util
import (
"sync"
)
var allNodesWaitGroup sync.WaitGroup
func GoNode(f func()) {
allNodesWaitGroup.Add(1)
go func() {
defer allNodesWaitGroup.Done()
f()
}()
}
func WaitForAllNodes() {
allNodesWaitGroup.Wait()
}
Then, replace the invocation of callback:
go callback(fileName)
With a call to your utility function:
util.GoNode(func() { callback(fileName) })
Last step, add this line at the end of your main, instead of your sleep. This will make sure the main thread is waiting for all routines to finish before the program can stop.
func main() {
// ...
util.WaitForAllNodes()
}