I was doing some debugging and had a bit of code like this:
go func() {
    if !finished {
        fmt.Println("Writing the data")
        writer.Write(data)
    }
}()
The finished variable is meant as a guard against writing to a writer that has been closed. However, it wasn't working: writes appeared to be getting past the flag check. I determined that the call to Println was yielding the goroutine, which could allow the writer to be closed after the flag check but before the write was attempted. Sure enough, removing the call seems to have fixed it. However, I wanted to verify this, and more importantly ask for suggestions on how to avoid the problem properly, rather than just avoiding prints there.
Any I/O, yes, including fmt.Println, can cause a pause in goroutine execution.
But in practice this distinction shouldn't matter: on any modern hardware with more than one CPU core, you can experience the race even if the goroutine is never paused.
You should always make your code concurrency safe.
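A common fix is to make the flag check and the write a single critical section, and to take the same lock when marking the writer finished. Here is a minimal sketch (the mu, safeWrite, and closeWriter names are illustrative assumptions, since the surrounding code wasn't posted):

package guard

import (
    "io"
    "sync"
)

var (
    mu       sync.Mutex
    finished bool
)

// safeWrite writes only while the writer is still open. The flag check and
// the write happen under the same lock, so the writer cannot be closed
// between them.
func safeWrite(w io.Writer, data []byte) {
    mu.Lock()
    defer mu.Unlock()
    if finished {
        return
    }
    w.Write(data)
}

// closeWriter marks the writer finished under the lock before closing it,
// so no goroutine can slip a write in after the close begins.
func closeWriter(c io.Closer) error {
    mu.Lock()
    finished = true
    mu.Unlock()
    return c.Close()
}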
Related
I know you can define functions called init in any package, and these functions will be executed before main. I use this to open my log file and my DB connection.
Is there a way to define code that will be executed when the program ends, either because it reaches the end of the main function or because it was interrupted? The only way I can think of is manually calling a deferred terminate function on each package used by main, but that's quite verbose and error prone.
The C atexit functionality was considered by the Go developers and the idea of adopting it was rejected.
From one of the related threads on golang-nuts:
Russ Cox:
Atexit may make sense in single-threaded, short-lived
programs, but I am skeptical that it has a place in a
long-running multi-threaded server.
I've seen many C++ programs that hang on exit because
they're running global destructors that don't really need to
run, and those destructors are cleaning up and freeing
memory that would be reclaimed by the operating system
anyway, if only the program could get to the exit system call.
Compared to all that pain, needing to call Flush when you're
done with a buffer seems entirely reasonable and is
necessary anyway for correct execution of long-running
programs.
Even ignoring that problem, atexit introduces even more
threads of control, and you have to answer questions like
do all the other goroutines stop before the atexit handlers
run? If not, how do they avoid interfering? If so, what if
one holds a lock that the handler needs? And on and on.
I'm not at all inclined to add Atexit.
Ian Lance Taylor:
The only fully reliable mechanism is a wrapper program that invokes the
real program and does the cleanup when the real program completes. That
is true in any language, not just Go.
In my somewhat unformed opinion, os.AtExit is not a great idea. It is
an unstructured facility that causes stuff to happen at program exit
time in an unpredictable order. It leads to weird scenarios like
programs that take a long time just to exit, an operation that should be
very fast. It also leads to weird functions like the C function _exit,
which more or less means exit-but-don't-run-atexit-functions.
That said, I think a special exit function corresponding to the init
function is an interesting idea. It would have the structure that
os.AtExit lacks (namely, exit functions are run in reverse order of when
init functions are run).
But exit functions won't help you if your program gets killed by the
kernel, or crashes because you call some C code that gets a segmentation
violation.
In general, I agree with jnml's answer. Should you still want to do it, however, you could use defer in the main() function, like this: http://play.golang.org/p/aUdFXHtFOM.
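The linked playground snippet isn't reproduced here, but the idea is roughly the following sketch (closeLog and closeDB are hypothetical cleanup functions for illustration): deferred calls in main run when main returns normally, though not on os.Exit, an unrecovered panic in another goroutine, or an external kill.

package main

import "fmt"

func main() {
    // Deferred calls run when main returns normally. They do NOT run on
    // os.Exit, nor when the process is killed from outside.
    defer closeDB()
    defer closeLog()

    fmt.Println("doing work")
}

func closeLog() { fmt.Println("log file closed") }
func closeDB()  { fmt.Println("db connection closed") }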
I am using Go race detection (the -race argument), and it detects some race conditions issues that I think should not be reported. I've created this sample code to explain my findings. Please do not comment about the goal of this example, as it has no goal other than to explain the issue.
This code:
package main

import (
    "fmt"
    "time"
)

var count int

func main() {
    go update()
    for {
        fmt.Println(count)
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        count++
    }
}
is reported with a race condition.
While this code:
package main

import (
    "fmt"
    "sync"
    "time"
)

var count int
var mutex sync.RWMutex

func main() {
    go update()
    for {
        mutex.RLock()
        fmt.Println(count)
        mutex.RUnlock()
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        mutex.Lock()
        count++
        mutex.Unlock()
    }
}
is not reported with any race condition issues.
My question is why?
There is no bug in the first code.
The main function is reading a variable that another goroutine is updating.
There is no potential hidden bug here.
The mutex in the second code does not provide any different behavior.
Where am I wrong here?
Your code contains a very clear race.
Your for loop is accessing count at the same time that the other goroutine is updating it. That's the definition of a race.
The main function is reading a variable that another goroutine is updating.
Yes, exactly. That's what a race is.
The mutex in the second code does not provide any different behavior.
Yes, it does. It prevents the variable from being read and written at the same time from different goroutines.
You need to draw a distinction between a synchronization bug and a data race. A synchronization bug is a property of the code, whereas a data race is a property of a particular execution of the program. The latter is a manifestation of the former, but is in general not guaranteed to occur.
There is no bug in the first code. The main function is reading a variable that another goroutine is updating. There is no potential hidden bug here.
The race detector only detects data races, not synchronization bugs. It may miss some data races (false negatives), but it never reports false positives:
The race detector is a powerful tool for checking the correctness of concurrent programs. It will not issue false positives, so take its warnings seriously.
In other words, when the race detector reports a data race, you can be sure that your code contains at least one synchronization bug. You need to fix such bugs; otherwise, all bets are off.
Lo and behold, your first code snippet does indeed contain a synchronization bug: package-level variable count is accessed (by main) and updated (by update, started as a goroutine) concurrently without any synchronization. Here is a relevant passage of the Go Memory Model:
Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.
To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.
Using a reader/writer mutual-exclusion lock, as you did in your second snippet, fixes your synchronization bug.
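If all you need is a shared counter, the sync/atomic package mentioned in the Memory Model passage above is another way to serialize access. A minimal sketch of your example rewritten that way (just an illustration, not the code from the question):

package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

var count int64

func main() {
    go update()
    for {
        // The atomic load pairs with the atomic add in update,
        // so there is no data race and the race detector stays quiet.
        fmt.Println(atomic.LoadInt64(&count))
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        atomic.AddInt64(&count, 1)
    }
}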
The mutex in the second code does not provide any different behavior.
You just got lucky, when you executed the first program, that no data race occurred. In general, you have no guarantee.
This is off topic for Go (and the sample Go code won't trigger the problem even on x86 CPUs), but I have a demonstration proof, from roughly a decade ago at this point, that "torn reads" can produce inconsistent values even if the read and write operations are done with LOCK CMPXCHG8B, on some x86 CPUs (I think we were using early Haswell implementations).
The particular conditions that trigger this are a little difficult to set up. We had a custom allocator that had a bug: it only did four-byte alignment.1 We then had a "lock-free" (single locking instruction) algorithm to add entries to a queue, with single-writer multi-reader semantics.
It turns out that LOCK CMPXCHG8B instructions "work" on misaligned pointers as long as they do not cross page boundaries. As soon as they do, though, the readers can see a torn read, in which they get half the old value and half the new value, when a writer is doing an atomic write.
The result was an extremely difficult-to-track-down bug, where the system would run well for hours or even days before tripping over one of these. I finally diagnosed it by observing the data patterns, and eventually tracked the problem down to the allocator.
1Whether this is a bug depends on how one uses the allocated objects, but we were using them as 8-byte-wide pointers with LOCK CMPXCHG8B instructions.
I'm going over some existing code and see this repeated several times:
defer mtx.Unlock()
mtx.Lock()
This looks wrong to me; I much prefer the idiomatic way of deferring the Unlock after performing the Lock. However, the documentation for Mutex.Lock doesn't specify a situation where the Lock will fail, so the behaviour of the early-defer pattern should be identical to the idiomatic way.
My question is: is there a compelling case to say this pattern is inferior (e.g. the Lock may fail and then the deferred Unlock will panic) and the code should therefore be changed, or should I leave it as is?
Short answer:
Yes, it's OK. The deferred call is only made when the surrounding function returns (well, sort of).
The longer, more nuanced answer:
It's risky, and should be avoided. In your 2 line snippet, it's not going to be a problem, but consider the following code:
func (o *Obj) foo() error {
    defer o.mu.Unlock() // deferred before the lock is ever taken
    if err := o.CheckSomething(); err != nil {
        return err // returns without holding the lock, yet still unlocks!
    }
    o.mu.Lock()
    // do stuff
    return nil
}
In this case, the mutex may not be locked at all, or worse still: it may be locked on another routine, and you end up unlocking it. Meanwhile yet another routine obtains a lock that it really shouldn't have, and you either get data races, or by the time that routine returns, the unlock call will panic.
Debugging this kind of mess is a nightmare you can, and should, avoid at all costs.
In addition to the code being less intuitive and more error prone, a defer also comes at a cost, depending on what you're trying to achieve. In most cases the cost is fairly marginal, but if you're dealing with something that is absolutely time critical, it's often better to add the unlock calls manually where needed. A good example of where I'd not use defer is caching stuff in a map[string]interface{}: I'd create a struct with the cached values and a sync.RWMutex field for concurrent use. If I use this cache a lot, the defer calls could start adding up. It can look a bit messy, but it's a case-by-case decision: either performance is what you aim for, or shorter, more readable code.
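As a rough sketch of what I mean (the cache type and method names are just illustrative, not from any particular codebase), with manual unlock calls instead of defer on the hot path:

package cacheexample

import "sync"

// cache guards a map with an RWMutex. Get and Set unlock manually rather
// than with defer, keeping the critical sections as short and cheap as possible.
type cache struct {
    mu   sync.RWMutex
    data map[string]interface{}
}

func newCache() *cache {
    return &cache{data: make(map[string]interface{})}
}

func (c *cache) Get(key string) (interface{}, bool) {
    c.mu.RLock()
    v, ok := c.data[key]
    c.mu.RUnlock()
    return v, ok
}

func (c *cache) Set(key string, v interface{}) {
    c.mu.Lock()
    c.data[key] = v
    c.mu.Unlock()
}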
Other things to note about defers:
If you have multiple defers in a function, the order in which they are invoked is defined (LIFO).
Deferred functions can alter the return values of the function they are deferred in (if you use named returns), as shown below.
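A minimal sketch of that last point (just an illustration, assuming a named return value):

package main

import "fmt"

// double has a named return value, so the deferred closure can modify
// what the caller actually receives.
func double(n int) (result int) {
    defer func() {
        result *= 2
    }()
    return n // the deferred function runs after this and doubles result
}

func main() {
    fmt.Println(double(21)) // prints 42
}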
I'm currently reading the slides of Go Concurrency Patterns. I'm a little bit confused about a seeming contradiction between a statement on slide #16:
When main returns, the program exits and takes the boring function down with it.
and another one on slide #19 (in combination with the example on slide #20):
A channel in Go provides a connection between two goroutines, allowing them to communicate.
If main is just a goroutine, how can it cause any other (spawned) goroutine to stop? In other words: in what sense is the goroutine named main special?*
* I searched for it, but found nothing obviously enlightening so far; the SO question with the promising title Difference between the main goroutine and spawned goroutines of a Go program asks for a completely different issue.
edit: changed the title, to focus on the difference between main and "normal" goroutines (after stumbling upon the Go runtime function Goexit)
edit: simplified question, to be even more focused on the specifics of main
I think you need to consider the goroutine implications separately to the process implications.
The main() function is a goroutine (or if you want to be really picky, called from an implicitly created goroutine). Using go creates other goroutines. Returning from main() terminates its goroutine but also terminates the process as a whole (and thus all other goroutines). It is also possible to terminate the process as a whole by calling os.Exit() or similar from any goroutine.
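A small sketch of the difference (just an illustration): the spawned goroutine below only gets to finish because main explicitly waits for it before returning.

package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(1)

    go func() {
        defer wg.Done()
        time.Sleep(100 * time.Millisecond)
        fmt.Println("spawned goroutine finished")
    }()

    // Without this Wait, main would return immediately, the process would
    // exit, and the spawned goroutine would be taken down with it.
    wg.Wait()
    fmt.Println("main returning; process exits now")
}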
In Golang a panic without a recover will crash the process, so I end up putting the following code snippet at the beginning of every function:
defer func() {
    if err := recover(); err != nil {
        fmt.Println(err)
    }
}()
just to prevent my program from crashing. Now I'm wondering: is it really the way to go? It seems a little strange to me to put the same code everywhere.
It seems to me that the Java way, bubbling exceptions up through the calling functions until the main function, is a better way to control exceptions/panics. I understand it's by Go's design, but what is the advantage of immediately crashing the process the way Go does?
You should only recover from a panic if you know exactly why. A Go program will panic under essentially two circumstances:
A program logic error (such as a nil pointer dereference or out-of-bounds array or slice access)
An intentional panic (called using panic(...)) from either your code or code that your code calls
In the first case, a crash is appropriate because it means that your program has entered a bad state and shouldn't keep executing. In the second case, you should only recover from the panic if you expect it. The best way to explain this is simply to say that it's extremely rare, and you'll know that case if you see it. I'm almost positive that whatever code you're writing, you don't need to recover from panics.
Generally, even with exceptions, you catch them at a "FaultBarrier". It's usually the place where all new threads are spawned. The point is to catch and log unexpected failures.
In Go, you use return values for all expected failures. The framework in which you work will generally have a fault barrier that catches a session (i.e., usually an HTTP transaction) and logs the problem. The only other place I see recover happening is things like a non-idempotent Close function: if you can't tell whether something is already closed but know it must be closed, you could use a recover so that a panic from a second close is ignored, rather than failing everything you are doing all the way up to the FaultBarrier.
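As a rough sketch of such a fault barrier (just an illustration of the idea, not code from any particular framework), a wrapper that recovers and logs around each HTTP request:

package main

import (
    "log"
    "net/http"
)

// faultBarrier wraps a handler so that an unexpected panic in one request
// is logged and turned into a 500 instead of taking down the whole server.
func faultBarrier(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if err := recover(); err != nil {
                log.Printf("panic handling %s: %v", r.URL.Path, err)
                http.Error(w, "internal server error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        panic("something unexpected")
    })
    log.Fatal(http.ListenAndServe(":8080", faultBarrier(mux)))
}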
I think a panic is not the same as an exception. You can handle an exception and the routine will keep running, whereas a panic terminates the current goroutine and you cannot skip that.
Exceptions are generally raised by the OS or runtime and cause the related handler to run; a panic is raised manually by the programmer and causes the goroutine to exit.
You can define multiple exception handlers for different pieces of code within a function, whereas the panic-recovery mechanism works for the whole function.
Exceptions are designed to be handled, whereas a panic is designed for termination, and the panic-recovery mechanism is just a trick to control that termination.
So exception handling is more comparable to error handling.
But how can you take advantage of it in Golang?
I will describe my use case to answer your question.
There are two types of blocking errors, panic and fatal. You cannot recover from a fatal error.
Sometimes you need to kill the process, but sometimes you need to restart it.
I use the recover() mechanism to recover from a panic in order to shut down the current goroutine and restart the main functionality.
So I must be careful about the error type. Some situations call for a fatal error, such as a missing required config file, while in others it is reasonable to restart the app. Think about all the situations where you would like the app to restart after a crash, such as:
an overloaded service (which leads to a DoS)
a missing DBMS
an unexpected error originating from other Go packages you use
a crashing imagick process, for example
and so on
So,
recover() is very beneficial in my case. It gives a chance to shut down the process cleanly before exiting.
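As a rough sketch of what I mean (the runApp name and restart policy are illustrative assumptions): a small supervisor loop that recovers from a panic, logs it, and restarts the main functionality.

package main

import (
    "log"
    "time"
)

// runApp stands in for the main functionality; here it just panics to
// demonstrate the restart behaviour.
func runApp() {
    time.Sleep(time.Second)
    panic("simulated crash")
}

// supervise runs the app, recovers from any panic, logs it, and restarts.
func supervise() {
    for {
        func() {
            defer func() {
                if err := recover(); err != nil {
                    log.Printf("recovered from panic: %v; restarting", err)
                }
            }()
            runApp()
        }()
    }
}

func main() {
    supervise()
}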
Side note: you can develop a bootstrapper app that runs the main app as a detached process and re-runs it if that process exits abnormally.
Use logging to debug while keeping your app running.