I'm trying to catch crashes/panics from goroutines that are created in my program, in order to send them to my crash-reporting server (such as Sentry or Raygun).
For example,
func main() {
go func() {
// Get this panic
panic("Go routine panic")
}()
}
The answer states a goroutine cannot recover from a panic in another goroutine.
What would be the idiomatic way to go about it?
You have to "inject" some code into the function that is launched as a new goroutine: it must defer a function that calls recover(). This is the only way to recover from a panicking state. See related: Why does `defer recover()` not catch panics?
For example:
go func() {
defer func() {
if r := recover(); r != nil {
fmt.Println("Caught:", r)
}
}()
panic("catch me")
}()
This will output (try it on the Go Playground):
Caught: catch me
It is impractical to repeat this in every goroutine you launch, but of course you can move the recovering-logging functionality to a named function, and just call that (deferred, of course):
func main() {
go func() {
defer logger()
panic("catch me")
}()
time.Sleep(time.Second)
}
func logger() {
if r := recover(); r != nil {
fmt.Println("Caught:", r)
}
}
This will output the same (try it on the Go Playground).
Yet another, more convenient and even more compact solution is to create a utility function, a "wrapper" which receives the function, and takes care of the recovering.
This is what it could look like:
func wrap(f func()) {
defer func() {
if r := recover(); r != nil {
fmt.Println("Caught:", r)
}
}()
f()
}
And now using it is even simpler:
go wrap(func() {
panic("catch me")
})
go wrap(func() {
panic("catch me too")
})
It will output (try it on the Go Playground):
Caught: catch me
Caught: catch me too
Final note:
Note that launching the actual goroutine happens outside of wrap(). This lets the caller decide whether a new goroutine is required, simply by prefixing the wrap() call with go. This approach is usually preferred in Go. It allows you to execute arbitrary functions by passing them to wrap(), which "protects" their execution (by recovering from panics and logging / reporting them) even if you do not wish to run them concurrently in a new goroutine. On the other hand, if you moved go inside wrap() by simply calling go f() there, it would no longer work, because the deferred recover() call would not run on the panicking goroutine.
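Since the question is about forwarding panics to a reporting server, here is a minimal sketch of the same wrapper with the logging replaced by a callback. The names protect and report are made up for this illustration, and report merely prints; an actual Sentry or Raygun client call would go in its place. The stack is captured with debug.Stack() from the standard runtime/debug package.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// report is a hypothetical stand-in for a crash-reporting client call.
// Replace its body with whatever your reporting service requires.
var report = func(v any, stack []byte) {
	fmt.Printf("panic: %v\n%s", v, stack)
}

// protect runs f and hands any panic value, together with the stack of the
// current goroutine, to report instead of letting the program crash.
func protect(f func()) {
	defer func() {
		if r := recover(); r != nil {
			report(r, debug.Stack())
		}
	}()
	f()
}

func main() {
	done := make(chan struct{})
	go func() {
		defer close(done)
		protect(func() { panic("catch me") })
	}()
	<-done // wait, so the report is printed before main returns
}
```

As with wrap(), the go keyword stays at the call site, so protect() can also guard code that runs on the current goroutine.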
I was referred to this question: Program recovered from panic does not exit as expected
It works fine but it relies on knowing where the panic occurs in order to place the deferred function.
My code is as follows.
package main
import "fmt"
func main() {
defer recoverPanic()
f1()
f2()
f3()
}
func f1() {
fmt.Println("f1")
}
func f2() {
defer f3() // <--- don't want to defer f3 here because I might not know f2 will panic; the panic could occur elsewhere
fmt.Println("f2")
panic("f2")
}
func f3() {
fmt.Println("f3")
}
func recoverPanic() {
if r := recover(); r != nil {
fmt.Printf("Cause of panic ==>> %q\n", r)
}
}
Having the deferred function call f3() in the panicking function works, output below.
f1
f2
f3
Cause of panic ==>> "f2"
What if you have an application where you don't know where a panic will occur? Do I need to put a defer in every function that might panic?
Commenting out the defer f3() gives me the following output.
f1
f2
Cause of panic ==>> "f2"
f3 never runs.
My question is how to continue execution of the program without having a deferred function call in every function that might panic?
You can't resume function execution after a panic. Panic is used when the current line of execution cannot continue correctly. Arbitrarily resuming execution after a panic (if it were possible) is begging immediately for another panic, because the state is already incorrect and just blazing ahead won't fix that.
For example, let's say a function panics when it tries to read out of bounds on a slice. How can it continue? What would it even mean to continue? Should it just read the out of bounds memory location and get garbage data? Continue with a zero value? Take a different value from the slice?
You must handle error cases; either by explicitly recovering, or preemptively checking / correcting conditions that will result in panic. At least in the standard library, functions that may spur a panic will say so in their documentation with an explanation of which conditions will result in panic.
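As a small illustration of "preemptively checking", here is a sketch for the out-of-bounds case mentioned above. The helper name at is made up for this example; it reports whether the index was valid, so the caller decides explicitly what should happen next.

```go
package main

import "fmt"

// at returns s[i] and true, or the zero value and false when i is out of
// range, so the caller decides what to do instead of the runtime panicking.
func at(s []int, i int) (int, bool) {
	if i < 0 || i >= len(s) {
		return 0, false
	}
	return s[i], true
}

func main() {
	s := []int{10, 20, 30}
	if v, ok := at(s, 5); ok {
		fmt.Println("value:", v)
	} else {
		fmt.Println("index 5 is out of range; handle it here")
	}
}
```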
If you commonly need to safely call void functions and recover from any panics, you can make a simple wrapper function for that.
func try(f func()) {
defer func() {
if err := recover(); err != nil {
fmt.Println("caught panic:", err)
}
}()
f()
}
Then
func main() {
try(f1)
try(f2)
try(f3)
}
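A possible variation, if you would rather decide at the call site what to do with the panic: have the wrapper return it as an ordinary error. This is only a sketch; the name tryErr and the error text are made up here. It relies on a named result so that the deferred function can set the return value.

```go
package main

import "fmt"

// tryErr runs f and converts a panic into an ordinary error value.
func tryErr(f func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	f()
	return nil
}

func main() {
	steps := []func(){
		func() { fmt.Println("f1") },
		func() { panic("f2") },
		func() { fmt.Println("f3") },
	}
	// Every step runs, even though the second one panics.
	for _, step := range steps {
		if err := tryErr(step); err != nil {
			fmt.Println(err)
		}
	}
}
```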
func GoCountColumns(in chan []string, r chan Result, quit chan int) {
for {
select {
case data := <-in:
r <- countColumns(data) // some calculation function
case <-quit:
return // stop goroutine
}
}
}
func main() {
fmt.Println("Welcome to the csv Calculator")
file_path := os.Args[1]
fd, _ := os.Open(file_path)
reader := csv.NewReader(bufio.NewReader(fd))
var totalColumnsCount int64 = 0
var totallettersCount int64 = 0
linesCount := 0
numWorkers := 10000
rc := make(chan Result, numWorkers)
in := make(chan []string, numWorkers)
quit := make(chan int)
t1 := time.Now()
for i := 0; i < numWorkers; i++ {
go GoCountColumns(in, rc, quit)
}
//start worksers
go func() {
for {
record, err := reader.Read()
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
if linesCount%1000000 == 0 {
fmt.Println("Adding to the channel")
}
in <- record
//data := countColumns(record)
linesCount++
//totalColumnsCount = totalColumnsCount + data.ColumnCount
//totallettersCount = totallettersCount + data.LettersCount
}
close(in)
}()
for i := 0; i < numWorkers; i++ {
quit <- 1 // quit goroutines from main
}
close(rc)
for i := 0; i < linesCount; i++ {
data := <-rc
totalColumnsCount = totalColumnsCount + data.ColumnCount
totallettersCount = totallettersCount + data.LettersCount
}
fmt.Printf("I counted %d lines\n", linesCount)
fmt.Printf("I counted %d columns\n", totalColumnsCount)
fmt.Printf("I counted %d letters\n", totallettersCount)
elapsed := time.Now().Sub(t1)
fmt.Printf("It took %f seconds\n", elapsed.Seconds())
}
My "Hello World" is a program that reads a CSV file and passes its records to a channel. The goroutines should then consume from this channel.
My problem is that I have no idea how to detect from the main goroutine that all data has been processed, so that I can exit my program.
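In addition to the other answers:
Take (great) care that a channel is closed on the writing side, not on the reading side. In GoCountColumns the r channel is the one being written to, so the responsibility for closing it lies with GoCountColumns. The technical reason is that the writer is the only actor that knows for sure that the channel will not be written to anymore, and thus that it is safe to close. Keep in mind that a channel may be closed only once: when several workers share the same r channel, as in your program, the close has to happen a single time after all of them have returned, otherwise the second close panics.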
func GoCountColumns(in chan []string, r chan Result, quit chan int) {
defer close(r) // this line.
for {
select {
case data := <-in:
r <- countColumns(data) // some calculation function
case <-quit:
return // stop goroutine
}
}
}
The convention for parameter order, if I may say so, is to put the destination first, the source second, and any other parameters after them (as copy and io.Copy do). GoCountColumns is preferably written as:
func GoCountColumns(dst chan Result, src chan []string, quit chan int) {
defer close(dst)
for {
select {
case data := <-src:
dst <- countColumns(data) // some calculation function
case <-quit:
return // stop goroutine
}
}
}
You send on quit right after the processing has started. That is illogical. The quit command is a forced exit sequence: it should be triggered once an exit signal is detected, to stop the current processing in the best state possible, which may well be incomplete. In other words, you should rely on signal.Notify from the os/signal package to capture exit events, and then notify your workers to quit. See https://golang.org/pkg/os/signal/#example_Notify
To write better parallel code, first list the routines you need in order to manage the program's lifetime, and identify those you must block on to ensure the work has finished before exiting.
Your code has two of them: read and map. To ensure complete processing, the main function must receive a signal that map has exited before exiting itself. Notice that the read function does not matter here.
You will also need code to capture an exit event triggered by the user.
Overall, it appears we need to block on two events to manage the lifetime. Schematically,
func main(){
go read()
go map(mapDone)
go signal()
select {
case <-mapDone:
case <-sig:
}
}
This simple code is a process-or-die approach. Indeed, when the user event is caught, the program exits immediately, without giving the other routines a chance to do whatever is required upon stop.
To improve this behavior you need, first, a way to tell the other routines that the program wants to exit and, second, a way to wait for those routines to finish their stop sequence before leaving.
To signal the exit event, or cancellation, you can use a context.Context: pass it to the workers and make them listen to it.
Again, schematically,
func main(){
ctx, cancel := context.WithCancel(context.Background())
go read(ctx)
go map(ctx,mapDone)
go signal()
select {
case <-mapDone:
case <-sig:
cancel()
}
}
(more on read and map later)
To wait for completion, many approaches are possible, as long as they are thread safe. Usually a sync.WaitGroup is used. Or, in cases like yours where there is only one routine to wait for, we can reuse the existing mapDone channel.
func main(){
ctx, cancel := context.WithCancel(context.Background())
go read(ctx)
go map(ctx,mapDone)
go signal()
select {
case <-mapDone:
case <-sig:
cancel()
<-mapDone
}
}
That is simple and straightforward, but it is not totally correct. The last receive from mapDone might block forever and make the program unstoppable. So you might add a second signal handler, or a timeout.
Schematically, the timeout solution is
func main(){
ctx, cancel := context.WithCancel(context.Background())
go read(ctx)
go map(ctx,mapDone)
go signal()
select {
case <-mapDone:
case <-sig:
cancel()
select {
case <-mapDone:
case <-time.After(time.Second):
}
}
}
You might also combine signal handling and a timeout in that last select.
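For reference, the schematic above can be turned into something runnable along these lines. This is only a sketch under a few assumptions: run, work, stubborn and grace are names invented for this example, the "work" is a timer standing in for real processing, and the exit event comes from signal.NotifyContext (available in os/signal since Go 1.16), whose Done channel plays the role of sig.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"time"
)

// grace is how long we wait for the task after asking it to stop.
const grace = 50 * time.Millisecond

// work is a stand-in task that honours cancellation.
func work(ctx context.Context) {
	select {
	case <-ctx.Done(): // asked to stop: clean up and return
	case <-time.After(100 * time.Millisecond): // the "real" processing
	}
}

// stubborn is a stand-in task that ignores cancellation.
func stubborn(ctx context.Context) {
	time.Sleep(10 * grace)
}

// run starts task in its own goroutine and waits for it. If stop fires
// first, it cancels the context and gives the task up to grace to return.
func run(stop <-chan struct{}, task func(context.Context)) string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	done := make(chan struct{})
	go func() {
		defer close(done)
		task(ctx)
	}()

	select {
	case <-done:
		return "finished"
	case <-stop:
		cancel()
		select {
		case <-done:
			return "stopped cleanly"
		case <-time.After(grace):
			return "gave up waiting"
		}
	}
}

func main() {
	// sigCtx is cancelled when the process receives an interrupt (Ctrl-C).
	sigCtx, release := signal.NotifyContext(context.Background(), os.Interrupt)
	defer release()
	fmt.Println(run(sigCtx.Done(), work))
}
```

The timeout branch is what keeps a task like stubborn from making the program unstoppable.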
Finally, there are a few things to say about how read and map listen to the context.
Starting with map: the implementation must receive from the ctx.Done() channel regularly to detect cancellation.
This is the easy part, as it only requires updating the select statement.
func GoCountColumns(ctx context.Context, dst chan Result, src chan []string) {
defer close(dst)
for {
select {
case <-ctx.Done():
<-time.After(time.Minute) // do something more useful.
return // quit. Notice the defer will be called.
case data := <-src:
dst <- countColumns(data) // some calculation function
}
}
}
The read part is a bit more tricky. Because it is I/O, it does not provide a selectable programming interface, and listening for context cancellation might seem contradictory. It is: while the routine is blocked in an I/O call it cannot listen to the context, and while it is waiting on the context's Done channel it cannot read from the I/O. In your case, the solution is to understand that your read loop is not relevant to your program's lifetime (recall that we only wait on mapDone), so we can simply ignore the context there.
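If you do want the read loop to stop early, one common compromise is to leave the blocking read alone and check the context only between reads and around the channel send. The following is a sketch of that idea, not a way to interrupt r.Read itself; readRecords and demo are names invented for this example, and the input comes from a string instead of a file.

```go
package main

import (
	"context"
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// readRecords forwards CSV records from r to dst until EOF, an error, or
// cancellation. r.Read itself cannot be interrupted; only the gap between
// reads and the channel send are cancellable.
func readRecords(ctx context.Context, r *csv.Reader, dst chan<- []string) error {
	defer close(dst) // the writer closes the channel
	for {
		if err := ctx.Err(); err != nil {
			return err // cancelled before the next read
		}
		record, err := r.Read()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		select {
		case dst <- record:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo is a hypothetical driver: it feeds input through readRecords and
// reports how many records arrived and how the reader ended.
func demo(input string, cancelFirst bool) (forwarded int, err error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	dst := make(chan []string)
	errc := make(chan error, 1)
	go func() {
		errc <- readRecords(ctx, csv.NewReader(strings.NewReader(input)), dst)
	}()
	for range dst {
		forwarded++
	}
	return forwarded, <-errc
}

func main() {
	fmt.Println(demo("a,b\nc,d\n", false))
	fmt.Println(demo("a,b\nc,d\n", true))
}
```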
In other cases, suppose for example you wanted to be able to restart at the last byte read (so on every read we increment a counter n of bytes read, and we want to save that value upon stop). Then a new routine has to be started, and thus there are multiple routines to wait for. In such cases a sync.WaitGroup is more appropriate.
Schematically,
func main(){
var wg sync.WaitGroup
processDone:=make(chan struct{})
ctx, cancel := context.WithCancel(context.Background())
go read(ctx)
wg.Add(1)
go saveN(ctx,&wg)
wg.Add(1)
go map(ctx,&wg)
go signal()
go func(){
wg.Wait()
close(processDone)
}()
select {
case <-processDone:
case <-sig:
cancel()
select {
case <-processDone:
case <-time.After(time.Second):
}
}
}
In this last code, the WaitGroup is passed around. Each routine is responsible for calling wg.Done(); when all routines are done, the processDone channel is closed to signal the select.
func GoCountColumns(ctx context.Context, dst chan Result, src chan []string, wg *sync.WaitGroup) {
defer wg.Done()
defer close(dst)
for {
select {
case <-ctx.Done():
<-time.After(time.Minute) // do something more useful.
return // quit. Notice the defer will be called.
case data := <-src:
dst <- countColumns(data) // some calculation function
}
}
}
Neither pattern is clearly preferred, but you might also see the WaitGroup being managed at the call sites only.
func main(){
var wg sync.WaitGroup
processDone:=make(chan struct{})
ctx, cancel := context.WithCancel(context.Background())
go read(ctx)
wg.Add(1)
go func(){
defer wg.Done()
saveN(ctx)
}()
wg.Add(1)
go func(){
defer wg.Done()
map(ctx)
}()
go signal()
go func(){
wg.Wait()
close(processDone)
}()
select {
case <-processDone:
case <-sig:
cancel()
select {
case <-processDone:
case <-time.After(time.Second):
}
}
}
Beyond all of that and the OP's questions, you must always evaluate up front whether parallel processing is pertinent for a given task. There is no single recipe: practice, and measure the performance of your code. See pprof.
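To tie this back to the original question (how does main know that everything was processed?), here is a compact sketch without any quit channel. The Result fields are taken from the question; the body of countColumns is an assumption (it counts bytes per field as a stand-in for "letters"), and process is a name made up for this example. The reader closes in when it is done, the workers range over in, and the results channel is closed exactly once after all workers have returned.

```go
package main

import (
	"fmt"
	"sync"
)

// Result mirrors the fields used in the question.
type Result struct {
	ColumnCount  int64
	LettersCount int64
}

// countColumns is an assumed implementation: columns per record, and bytes
// per field as a stand-in for letters.
func countColumns(record []string) Result {
	var letters int64
	for _, field := range record {
		letters += int64(len(field))
	}
	return Result{ColumnCount: int64(len(record)), LettersCount: letters}
}

// process fans records out to numWorkers goroutines and sums their results.
func process(records [][]string, numWorkers int) (lines int, total Result) {
	in := make(chan []string)
	out := make(chan Result)

	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for record := range in { // ends once in is closed and drained
				out <- countColumns(record)
			}
		}()
	}

	// Producer: the only writer to in, so it is the one that closes it.
	go func() {
		defer close(in)
		for _, record := range records {
			in <- record
		}
	}()

	// Closer: out is closed a single time, after every worker has returned.
	go func() {
		wg.Wait()
		close(out)
	}()

	// main's side: this loop ends exactly when all data has been processed.
	for r := range out {
		lines++
		total.ColumnCount += r.ColumnCount
		total.LettersCount += r.LettersCount
	}
	return lines, total
}

func main() {
	records := [][]string{{"a", "bc"}, {"def"}}
	lines, total := process(records, 4)
	fmt.Printf("I counted %d lines, %d columns, %d letters\n",
		lines, total.ColumnCount, total.LettersCount)
}
```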
There is way too much going on in this code. You should restructure your code into short functions that serve specific purposes to make it possible for someone to help you out easily (and help yourself as well).
You should read the following Go article, which goes into concurrency patterns:
https://blog.golang.org/pipelines
There are multiple ways to make one goroutine wait for some other work to finish. The most common ways are wait groups (as in the example I have provided) or channels.
func processSomething(...) {
...
}
func main() {
workers := &sync.WaitGroup{}
for i := 0; i < numWorkers; i++ {
workers.Add(1) // you want to call this from the calling go-routine and before spawning the worker go-routine
go func() {
defer workers.Done() // you want to call this from the worker go-routine when the work is done (NOTE the defer, which ensures it is called no matter what)
processSomething(....) // your async processing
}()
}
// this will block until all workers have finished their work
workers.Wait()
}
You can use a channel to block main until completion of a goroutine.
package main
import (
"log"
"time"
)
func main() {
c := make(chan struct{})
go func() {
time.Sleep(3 * time.Second)
log.Println("bye")
close(c)
}()
// This blocks until the channel is closed by the routine
<-c
}
There is no need to write anything into the channel. The receive blocks until a value is sent or, as we use here, the channel is closed.
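To make the possible outcomes of a receive concrete, here is a small sketch. The helper state is made up for this example; it uses a select with a default case to peek at a channel without blocking, and the second value of the receive tells a real value apart from a closed channel.

```go
package main

import "fmt"

// state reports what a receive on c does right now, without blocking.
// Note that it consumes a value if one is available.
func state(c <-chan int) string {
	select {
	case _, ok := <-c:
		if !ok {
			return "closed"
		}
		return "value"
	default:
		return "would block"
	}
}

func main() {
	c := make(chan int, 1)
	fmt.Println(state(c)) // nothing sent yet
	c <- 1
	fmt.Println(state(c)) // a buffered value is waiting
	close(c)
	fmt.Println(state(c)) // receives on a closed channel never block
}
```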
I'm building an app that runs a command every time the code changes. I'm using fsnotify for this feature, but I can't understand how it makes the main goroutine wait.
I found that using sync.WaitGroup is more idiomatic, but I'm curious how done chan bool makes the main goroutine wait in the fsnotify example code.
I've tried removing done from the fsnotify example code, but then the program does not wait for the goroutine; it just exits.
watcher, err := fsnotify.NewWatcher()
if err != nil {
log.Fatal(err)
}
defer watcher.Close()
done := make(chan bool)
go func() {
for {
select {
case event, ok := <-watcher.Events:
if !ok {
return
}
log.Println("event:", event)
if event.Op&fsnotify.Write == fsnotify.Write {
log.Println("modified file:", event.Name)
}
case err, ok := <-watcher.Errors:
if !ok {
return
}
log.Println("error:", err)
}
}
}()
err = watcher.Add("/tmp/foo")
if err != nil {
log.Fatal(err)
}
<-done
I'm not entirely sure what you're asking, but there's a subtle bug in the code you've provided.
A done channel is a common way to block until an action completes. It is used like this:
done := make(chan X) // Where X is any type
go func() {
// Some logic, possibly in a loop
close(done)
}()
// Other logic
<-done // Wait for `done` to be closed
The type of the channel is unimportant, as no data is (necessarily) sent over the channel, so bool works, but struct{} is more idiomatic, as it indicates that no meaningful data is sent.
Your example almost does this, except that it never calls close(done). This is a bug. It means that the code will always wait forever at <-done, thus negating the entire purpose of a done channel. Your example code will never exit.
This means the code, as you have provided, could be also written as:
go func() {
// Do stuff
}()
// Do other stuff
<Any code that blocks forever>
Because there are countless ways to block forever--none of them ever useful in practice--the channel in your example is not needed.
From my own research, I found an answer from someone on reddit.com. It is a bit of a trick: using <-done makes the main goroutine wait for a value from the done channel, so the app keeps running and fsnotify can keep watching for and delivering events.
I am looking for a way to asynchronously execute two functions in Go that return different results and errors, wait for them to finish, and print both results. Also, if one of the functions returns an error, I do not want to wait for the other function; I just want to print the error.
For example, I have these functions:
func methodInt(error bool) (int, error) {
<-time.NewTimer(time.Millisecond * 100).C
if error {
return 0, errors.New("Some error")
} else {
return 1, nil
}
}
func methodString(error bool) (string, error) {
<-time.NewTimer(time.Millisecond * 120).C
if error {
return "", errors.New("Some error")
} else {
return "Some result", nil
}
}
Here https://play.golang.org/p/-8StYapmlg is how I implemented it, but it has too much code I think. It can be simplified by using interface{} but I don't want to go this way. I want something simpler as, for example, can be implemented in C# with async/await. Probably there is some library that simplifies such operation.
UPDATE: Thanks for your responses! It is awesome how fast I got help! I like the usage of WaitGroup. It obviously makes the code more robust to changes, so I can easily add another async method without changing the exact count of methods at the end. However, there is still so much code in comparison to the same thing in C#. I know that in Go I don't need to explicitly mark methods as async, making them actually return tasks, but the method calls look much simpler in C#; for example, consider this link (catching exceptions is actually also needed there).
By the way, I found that in my task I actually don't need to know returning type of the functions I want to run async because it will be anyway marshaled to json, and now I just call multiple services in the endpoint layer of go-kit.
You should create two channels, one for errors and one for results, then read the errors first and, if there are no errors, read the results. This sample should work for your use case:
package main
import (
"errors"
"sync"
)
func test(i int) (int, error) {
if i > 2 {
return 0, errors.New("test error")
}
return i + 5, nil
}
func test2(i int) (int, error) {
if i > 3 {
return 0, errors.New("test2 error")
}
return i + 7, nil
}
func main() {
results := make(chan int, 2)
errors := make(chan error, 2)
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
result, err := test(3)
if err != nil {
errors <- err
return
}
results <- result
}()
wg.Add(1)
go func() {
defer wg.Done()
result, err := test2(3)
if err != nil {
errors <- err
return
}
results <- result
}()
// here we wait in other goroutine to all jobs done and close the channels
go func() {
wg.Wait()
close(results)
close(errors)
}()
for err := range errors {
// an error happened; you could return from your calling function here
println(err.Error())
return
}
for res := range results {
println("--------- ", res, " ------------")
}
}
I think sync.WaitGroup can be used here. It can wait for a varying, dynamic number of goroutines.
I have created a smaller, self-contained example of how you can have two go routines run asynchronously and wait for both to finish or quit the program if an error occurs (see below for an explanation):
package main
import (
"errors"
"fmt"
"math/rand"
"time"
)
func main() {
rand.Seed(time.Now().UnixNano())
// buffer the channel so the async go routines can exit right after sending
// their error
status := make(chan error, 2)
go func(c chan<- error) {
if rand.Intn(2) == 0 {
c <- errors.New("func 1 error")
} else {
fmt.Println("func 1 done")
c <- nil
}
}(status)
go func(c chan<- error) {
if rand.Intn(2) == 0 {
c <- errors.New("func 2 error")
} else {
fmt.Println("func 2 done")
c <- nil
}
}(status)
for i := 0; i < 2; i++ {
if err := <-status; err != nil {
fmt.Println("error encountered:", err)
break
}
}
}
What I do is create a channel that is used to synchronize the two goroutines with main. Receiving from it blocks until a value is available; sending does not block here, because the channel is buffered. The channel is used to pass the error value around, or nil if the function succeeds.
At the end I read one value per async go routine from the channel. This blocks until a value is received. If an error occurs, I exit the loop, thus quitting the program.
The functions either succeed or fail randomly.
I hope this gets you going on how to coordinate go routines, if not, let me know in the comments.
Note that if you run this in the Go Playground, the rand.Seed will do nothing, the playground always has the same "random" numbers, so the behavior will not change.
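If you also need the (differently typed) results and not only the errors, one way to stay away from interface{} is a typed channel per result plus a shared error channel. The sketch below adapts methodInt and methodString from the question (with the parameter renamed to fail and made-up error messages); runBoth is a name invented for this example. It returns as soon as the first error arrives, without waiting for the other function.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

func methodInt(fail bool) (int, error) {
	time.Sleep(100 * time.Millisecond)
	if fail {
		return 0, errors.New("int error")
	}
	return 1, nil
}

func methodString(fail bool) (string, error) {
	time.Sleep(120 * time.Millisecond)
	if fail {
		return "", errors.New("string error")
	}
	return "Some result", nil
}

// runBoth runs both methods concurrently and returns on the first error.
func runBoth(failInt, failString bool) (int, string, error) {
	// All channels are buffered so that no goroutine stays blocked on its
	// send after runBoth has already returned.
	ints := make(chan int, 1)
	strs := make(chan string, 1)
	errc := make(chan error, 2)

	go func() {
		v, err := methodInt(failInt)
		if err != nil {
			errc <- err
			return
		}
		ints <- v
	}()
	go func() {
		v, err := methodString(failString)
		if err != nil {
			errc <- err
			return
		}
		strs <- v
	}()

	var (
		i int
		s string
	)
	// Each goroutine sends exactly one message, so two receives suffice.
	for pending := 2; pending > 0; pending-- {
		select {
		case err := <-errc:
			return 0, "", err
		case i = <-ints:
		case s = <-strs:
		}
	}
	return i, s, nil
}

func main() {
	fmt.Println(runBoth(false, false))
	fmt.Println(runBoth(true, false))
}
```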
Why does a call to defer func() { recover() }() successfully recover a panicking goroutine, but a call to defer recover() not?
As a minimal example, this code doesn't panic
package main
func main() {
defer func() { recover() }()
panic("panic")
}
However, replacing the anonymous function with recover directly panics
package main
func main() {
defer recover()
panic("panic")
}
Quoting from the documentation of the built-in function recover():
If recover is called outside the deferred function it will not stop a panicking sequence.
In your second case recover() itself is the deferred function, and obviously recover() does not call itself. So this will not stop the panicking sequence.
If recover() were to call recover() itself, it would stop the panicking sequence (but why would it do that?).
Another Interesting Example:
The following code also doesn't panic (try it on the Go Playground):
package main
func main() {
var recover = func() { recover() }
defer recover()
panic("panic")
}
What happens here is that we create a recover variable of function type whose value is an anonymous function calling the built-in recover() function. We then specify a call to the value of the recover variable as the deferred function, so calling the built-in recover() from it stops the panicking sequence.
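If you want to observe the difference between the two forms without crashing the program, a small harness like the following can help. The names direct, wrapped and escapes are made up for this sketch; escapes simply reports whether a panic propagated out of the function it was given.

```go
package main

import "fmt"

// direct defers the built-in itself, so recover is not called by a deferred
// function and the panic keeps propagating.
func direct() {
	defer recover()
	panic("direct")
}

// wrapped defers a function that calls recover, which stops the panic.
func wrapped() {
	defer func() { recover() }()
	panic("wrapped")
}

// escapes reports whether a panic propagated out of f.
func escapes(f func()) (escaped bool) {
	defer func() {
		if recover() != nil {
			escaped = true
		}
	}()
	f()
	return false
}

func main() {
	fmt.Println("direct escapes:", escapes(direct))
	fmt.Println("wrapped escapes:", escapes(wrapped))
}
```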
The Handling panic section mentions that
Two built-in functions, panic and recover, assist in reporting and handling run-time panics
The recover function allows a program to manage behavior of a panicking goroutine.
Suppose a function G defers a function D that calls recover and a panic occurs in a function on the same goroutine in which G is executing.
When the running of deferred functions reaches D, the return value of D's call to recover will be the value passed to the call of panic.
If D returns normally, without starting a new panic, the panicking sequence stops.
That illustrates that recover is meant to be called from within a deferred function, not deferred directly.
When a panic occurs, the "deferred function" cannot be the built-in recover() itself; it must be a function specified in a defer statement that then calls recover().
DeferStmt = "defer" Expression .
The expression must be a function or method call; it cannot be parenthesized.
Calls of built-in functions are restricted as for expression statements.
With the exception of specific built-in functions, function and method calls and receive operations can appear in statement context.
An observation is that the real problem here is the design of defer and thus the answer should say that.
Motivating this answer, defer currently needs to take exactly one level of nested stack from a lambda, and the runtime uses a particular side effect of this constraint to make a determination on whether recover() returns nil or not.
Here's an example of this:
func b() {
defer func() { if recover() != nil { fmt.Printf("bad") } }()
}
func a() {
defer func() {
b()
if recover() != nil {
fmt.Printf("good")
}
}()
panic("error")
}
The recover() in b() should return nil.
In my opinion, a better choice would have been to say that defer takes a function BODY, or block scope (rather than a function call), as its argument. At that point, panic and the recover() return value could be tied to a particular stack frame, and any inner stack frame would have a nil panicking context. Thus, it would look like this:
func b() {
defer { if recover() != nil { fmt.Printf("bad") } }
}
func a() {
defer {
b()
if recover() != nil {
fmt.Printf("good")
}
}
panic("error")
}
At this point, it's obvious that a() is in a panicking state, but b() is not, and any side effects like "being in the first stack frame of a deferred lambda" aren't necessary to correctly implement the runtime.
So, going against the grain here: the reason this doesn't work as might be expected is a mistake in the design of the defer keyword in the Go language, which was worked around using non-obvious implementation-detail side effects and then codified as such.