I'm going over some existing code and see this repeated several times
defer mtx.Unlock()
mtx.Lock()
This looks wrong to me; I much prefer the idiomatic way of deferring the Unlock after performing the Lock. However, the documentation for Mutex.Lock doesn't specify a situation where the Lock will fail. Therefore, the behaviour of the early-defer pattern should be identical to the idiomatic way.
My question is: is there a compelling case to say this pattern is inferior (e.g. the Lock may fail and then the deferred Unlock will panic), such that the code should be changed, or should I leave it as is?
Short answer:
Yes, it's OK. The deferred call is made when the function returns, no matter where in the function the defer statement itself appears (well, sort of: it runs after the return values are set, but before control goes back to the caller).
The longer, more nuanced answer:
It's risky and should be avoided. In your two-line snippet it's not going to be a problem, but consider the following code:
func (o *Obj) foo() error {
    defer o.mu.Unlock() // deferred before the mutex is ever locked
    if err := o.CheckSomething(); err != nil {
        return err // mutex was never locked, so the deferred Unlock panics
    }
    o.mu.Lock()
    // do stuff
    return nil
}
In this case, the mutex may not be locked at all, or worse still: it may be locked on another routine, and you end up unlocking it. Meanwhile yet another routine obtains a lock that it really shouldn't have, and you either get data races, or by the time that routine returns, the unlock call will panic.
Debugging this kind of mess is a nightmare you can, and should avoid at all cost.
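For contrast, here is the same sketch with the conventional ordering: the lock is taken first, and the unlock is deferred only once there is actually something to undo, so the two calls are always paired.

func (o *Obj) foo() error {
    if err := o.CheckSomething(); err != nil {
        return err // nothing is locked yet, so there is nothing to unlock
    }
    o.mu.Lock()
    defer o.mu.Unlock() // runs exactly once, and always on a held lock
    // do stuff
    return nil
}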
In addition to the code reading counter-intuitively and being more error-prone, a defer does come at a cost, depending on what you're trying to achieve. In most cases the cost is fairly marginal, but if you're dealing with something that is absolutely time-critical, it's often better to manually add the unlock calls where needed. A good example where I'd not use defer is if you're caching stuff in a map[string]interface{}: I'd create a struct with the cached values and a sync.RWMutex field for concurrent use. If I use this cache a lot, the defer calls could start adding up. It can look a bit messy, but it's on a case-by-case basis: either performance is what you aim for, or shorter, more readable code.
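To make that concrete, a minimal sketch of such a cache (the type and method names here are hypothetical; the manual unlock calls stand in for defer on the hot path):

import "sync"

type cache struct {
    mu   sync.RWMutex
    data map[string]interface{}
}

func (c *cache) Get(key string) (interface{}, bool) {
    c.mu.RLock()
    v, ok := c.data[key]
    c.mu.RUnlock() // manual unlock avoids the (small) per-call defer overhead
    return v, ok
}

func (c *cache) Set(key string, v interface{}) {
    c.mu.Lock()
    c.data[key] = v
    c.mu.Unlock()
}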
Other things to note about defers:
If you have multiple defers in a function, the order in which they are invoked is defined (LIFO).
Deferred functions can alter the return values of the enclosing function, if you use named returns (see the sketch below).
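A small runnable sketch demonstrating both points (the function names are made up for illustration):

package main

import "fmt"

func demo() (n int) {
    defer fmt.Println("deferred first, printed second") // LIFO: pushed first, runs last
    defer fmt.Println("deferred second, printed first")
    defer func() { n *= 2 }() // alters the named return value after return sets it
    return 5                  // n becomes 5, then the defers run: demo() yields 10
}

func main() {
    fmt.Println(demo())
}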
Related
I am using Go race detection (the -race argument), and it detects some data races that I think should not be reported. I've created this sample code to explain my findings. Please do not comment on the goal of this example, as it has no goal other than to explain the issue.
This code:
package main

import (
    "fmt"
    "time"
)

var count int
func main() {
go update()
for {
fmt.Println(count)
time.Sleep(time.Second)
}
}
func update() {
for {
time.Sleep(time.Second)
count++
}
}
is reported with a race condition.
While this code:
package main

import (
    "fmt"
    "sync"
    "time"
)

var count int
var mutex sync.RWMutex
func main() {
go update()
for {
mutex.RLock()
fmt.Println(count)
mutex.RUnlock()
time.Sleep(time.Second)
}
}
func update() {
for {
time.Sleep(time.Second)
mutex.Lock()
count++
mutex.Unlock()
}
}
is not reported with any race condition issues.
My question is: why?
There is no bug in the first code.
The main function is reading a variable that another goroutine is updating.
There is no potential hidden bug here.
The mutex in the second code does not provide any different behavior.
Where am I wrong here?
Your code contains a very clear race.
Your for loop is accessing count at the same time that the other goroutine is updating it. That's the definition of a race.
The main function is reading a variable that another goroutine is updating.
Yes, exactly. That's what a race is.
The mutex in the second code does not provide any different behavior.
Yes, it does. It prevents the variable from being read and written at the same time from different goroutines.
You need to draw a distinction between a synchronization bug and a data race. A synchronization bug is a property of the code, whereas a data race is a property of a particular execution of the program. The latter is a manifestation of the former, but is in general not guaranteed to occur.
There is no bug in the first code. The main function is reading a variable that another goroutine is updating. There is no potential hidden bug here.
The race detector only detects data races, not synchronization bugs. It may miss some data races (false negatives), but it never reports false positives:
The race detector is a powerful tool for checking the correctness of concurrent programs. It will not issue false positives, so take its warnings seriously.
In other words, when the race detector reports a data race, you can be sure that your code contains at least one synchronization bug. You need to fix such bugs; otherwise, all bets are off.
Lo and behold, your first code snippet does indeed contain a synchronization bug: package-level variable count is accessed (by main) and updated (by update, started as a goroutine) concurrently without any synchronization. Here is a relevant passage of the Go Memory Model:
Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.
To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.
Using a reader/writer mutual-exclusion lock, as you did in your second snippet, fixes your synchronization bug.
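If all you need is a safe counter, the sync/atomic package would also fix the bug. A minimal sketch of that alternative (using atomic.Int64, available since Go 1.19; this is just one possible rewrite, not the only fix):

package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

var count atomic.Int64 // loads and increments are atomic, so no mutex is needed

func main() {
    go update()
    for {
        fmt.Println(count.Load())
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        count.Add(1)
    }
}

Running either fixed version under go run -race should report nothing.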
The mutex in the second code does not provide any different behavior.
You just got lucky, when you executed the first program, that the data race didn't visibly corrupt anything. In general, you have no guarantee.
This is off topic for Go (and the sample Go code won't trigger the problem even on x86 CPUs), but I have a demonstration proof, from roughly a decade ago at this point, that "torn reads" can produce inconsistent values even if the read and write operations are done with LOCK CMPXCHG8B, on some x86 CPUs (I think we were using early Haswell implementations).
The particular conditions that trigger this are a little difficult to set up. We had a custom allocator that had a bug: it only did four-byte alignment.¹ We then had a "lock-free" (single locking instruction) algorithm to add entries to a queue, with single-writer multi-reader semantics.
It turns out that LOCK CMPXCHG8B instructions "work" on misaligned pointers as long as they do not cross page boundaries. As soon as they do, though, the readers can see a torn read, in which they get half the old value and half the new value, when a writer is doing an atomic write.
The result was an extremely difficult-to-track-down bug, where the system would run well for hours or even days before tripping over one of these. I finally diagnosed it by observing the data patterns, and eventually tracked the problem down to the allocator.
¹Whether this is a bug depends on how one uses the allocated objects, but we were using them as 8-byte-wide pointers with LOCK CMPXCHG8B instructions.
I know that the idiomatic Go way of handling errors is to treat them as values, checked with an if statement to see whether they are nil.
However, it soon gets tedious in a long function where you would need to write if err != nil { ... } in multiple places.
I am aware that error handling is one of the pain points in the Go community.
I was just thinking why can't we do something like this,
func Xyz(param1 map[string]interface{}, param2 string) (return1 map[string]interface{}, err error) {
defer func() {
if r := recover(); r != nil {
        err = fmt.Errorf("error: %v", r)
}
}()
.....
.....
.....
    // Code causes a panic
    return
}
In your function, have a deferred function call which makes use of recover, so that if any panic occurs, the call stack starts unwinding, recover is invoked, the program does not terminate, and the error is handled and returned to the caller.
Here is a Go Playground example,
https://play.golang.org/p/-bG-xEfSO-Q
My question is what is the downside of this approach? Is there anything that we lose by this?
I understand that the recover function only works for the same goroutine. Let's assume that this is on the same goroutine.
Can you? Yes.
This is in fact done in a few cases, even in the standard library. See encoding/json for examples.
But this should only ever be done within the confines of your private API. That is to say, you should not write an API, whose exposed behavior includes a possible panic-as-error-state. So ensure that you recover all panics, and convert them to errors (or otherwise handle them) before returning a value to the consumer of your API.
What is the downside of this approach? Is there anything that we lose by this?
A few things come to mind:
It clearly violates the principle of least astonishment, which is why you should absolutely never let this practice cross your public API boundary.
It's cumbersome any time you need to act based on the error type or content, as it requires you to recover the panicked value, convert it back to its original type (which may or may not be an error type), do your inspection/action, then possibly re-panic (see the sketch after this list). It's much simpler to do all this with just a plain error value.
Closely related to #2, treating errors as values gives you a lot of fine-grained control over when and how to behave in response to an error condition. panic is a very blunt instrument in this regard, so you lose a lot of control.
panic + recover does not perform as well as simply returning an error. In situations where an error truly is exceptional, this may not matter (e.g. in the encoding/json example, it's used when a write fails, which aborts the entire operation, so efficiency is not of high importance).
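A minimal sketch of the "keep it inside your private API" pattern, loosely modeled on what encoding/json does internally (the parseError type and the parse/parseBody functions are hypothetical):

import "errors"

// parseError is a private wrapper so recover can tell our own deliberate
// panics apart from genuine bugs.
type parseError struct{ err error }

func parse(input string) (err error) {
    defer func() {
        if r := recover(); r != nil {
            pe, ok := r.(parseError)
            if !ok {
                panic(r) // not ours: re-panic so real bugs still crash loudly
            }
            err = pe.err // ours: convert back into an ordinary error value
        }
    }()
    parseBody(input)
    return nil
}

// parseBody sits deep in the call stack and aborts via panic rather than
// threading an error through every intermediate return.
func parseBody(input string) {
    if input == "" {
        panic(parseError{errors.New("empty input")})
    }
    // ...
}

Note how the panic never escapes parse: the caller only ever sees a plain error.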
I'm sure there are other reasons people can come up with; Google is full of blog posts on the topic.
I was doing some debugging and had a bit of code like this:
go func() {
if !finished {
fmt.Println("Writing the data")
writer.Write(data)
}
}()
The finished variable is meant as a guard against writing to a writer that has been closed. However, it wasn't working. It appeared to be getting past the flag. I determined that the call to Println was yielding the goroutine, which could allow the writer to be closed after checking the flag but before attempting the write. Sure enough, removing the call seems to have fixed it. However, I wanted to verify, and more importantly, to ask for suggestions on how to avoid this properly, rather than just avoiding prints in there.
Any I/O, yes, including fmt.Println, can cause a pause in goroutine execution.
But in practice, this shouldn't matter: on any modern hardware with more than one CPU core, you can experience the race even if the goroutine is never paused, because checking the flag and performing the write are two separate, non-atomic steps.
You should always make your code concurrency-safe.
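A minimal sketch of one way to do that, bundling the flag, the lock, and the writer together so that "check finished, then write" becomes a single atomic step (the guardedWriter type and its choice of error are made up for illustration):

import (
    "io"
    "sync"
)

type guardedWriter struct {
    mu       sync.Mutex
    finished bool
    w        io.Writer
}

func (g *guardedWriter) Write(data []byte) (int, error) {
    g.mu.Lock()
    defer g.mu.Unlock()
    if g.finished {
        return 0, io.ErrClosedPipe // already finished: refuse the write
    }
    return g.w.Write(data) // no Finish call can interleave here
}

func (g *guardedWriter) Finish() {
    g.mu.Lock()
    g.finished = true
    g.mu.Unlock()
}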
I am new to Go and I am reading an example from the gopl book.
Section 9.8.4 of The Go Programming Language book explains why goroutines have no notion of identity that is accessible to the programmer:
Goroutines have no notion of identity that is accessible to the programmer. This is by design, since thread-local storage tends to be abused. For example, in a web server implemented in a language with thread-local storage, it’s common for many functions to find information about the HTTP request on whose behalf they are currently working by looking in that storage. However, just as with programs that rely excessively on global variables, this can lead to an unhealthy ‘‘action at a distance’’ in which the behavior of a function is not determined by its arguments alone, but by the identity of the thread in which it runs. Consequently, if the identity of the thread should change—some worker threads are enlisted to help, say—the function misbehaves mysteriously.
and uses the example of a web server to illustrate this point. However, I have difficulty understanding why the so-called "action at a distance" is a bad practice and how it leads to
a function is not determined by its arguments alone, but by the identity of the thread in which it runs.
Could anyone give an explanation for this (preferably with short code snippets)?
Any help is appreciated!
Let's say we have the following code:
func doubler(num int) int {
    return num + num
}
doubler(5) will return 10. go doubler(5) will also compute 10 (the result is discarded, since go statements don't return values).
You can do the same with some sort of thread-local storage, if you want:
func doubler() int {
    // getThreadLocal is hypothetical: Go deliberately has no thread-local storage.
    return getThreadLocal("num") + getThreadLocal("num")
}
And we could run this with something like:
go func() {
setThreadLocal("num", 10)
doubler()
}()
But which is clearer? The variant which explicitly passes the num argument, or the variant which "magically" gets this from sort of thread-local storage?
This is what is meant with "action at a distance". The line setThreadLocal("num", 10) (which is distant) affects how doubler() behaves.
This example is clearly artificial, but the same principle applies to more realistic examples. For example, in some environments it's not uncommon to use thread-local storage for things such as user information, or other "global" variables.
This is why the paragraph you quoted compares it to global variables: thread-local storage is effectively a global variable, applicable only to the current thread.
When you pass parameters as arguments, things are much more clearly defined. There is no magic (and often undocumented) global state that you need to think about when debugging or writing tests.
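For completeness: when Go code genuinely needs request-scoped values, the usual idiom is to pass a context.Context explicitly rather than to reach for anything thread-local. A minimal sketch (the key type and handler are made up for illustration):

package main

import (
    "context"
    "fmt"
)

type requestIDKey struct{} // unexported key type avoids collisions with other packages

func handle(ctx context.Context) {
    // The value travels with the explicit ctx argument, not with the
    // goroutine, so the dependency is visible in every signature.
    if id, ok := ctx.Value(requestIDKey{}).(string); ok {
        fmt.Println("handling request", id)
    }
}

func main() {
    ctx := context.WithValue(context.Background(), requestIDKey{}, "req-42")
    handle(ctx)
}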
See my repo
package main
import (
"fmt"
"time"
"github.com/timandy/routine"
)
func main() {
goid := routine.Goid()
fmt.Printf("cur goid: %v\n", goid)
go func() {
goid := routine.Goid()
fmt.Printf("sub goid: %v\n", goid)
}()
    // Wait for the sub-goroutine to finish executing.
time.Sleep(time.Second)
}
I recommend looking at this post for an example of why someone might want to get information about the current thread/thread a function is running in:
stackoverflow - main threads in C#
As pointed out in the question, conditioning the behavior of a function on certain thread requirements (most likely) produces fragile/error-prone code that is difficult to debug.
I guess what your textbook is trying to say is that a function should never rely on running in a specific thread, look up threads, etc., because this can cause unexpected behavior (especially in an API, if it's not obvious to the end user that a function has to run in a specific thread). In Go, anything like that is impossible purely by language design. The behavior of a goroutine never depends on threads or anything similar, simply because goroutines don't have an identity, as you say correctly.
I want to know exactly what could happen when Go maps are accessed by multiple goroutines. Let's assume we have a map[int]*User. Can modifying fields of the User structure from multiple goroutines cause data corruption? Or is it just operations like len() that are not thread-safe? What would be different if maps were thread-safe in Go?
Concurrently modifying the *User could cause corruption regardless of the map. Reading the pointer from the map concurrently is safe, as long as there are no modifications to the map. Modifying the data *User points to makes no changes to the map itself.
Concurrently modifying the map[int]*User itself also risks data corruption.
There are no benign data races, always test your code with the race detector.
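A minimal sketch of the standard fix, wrapping the map in a struct with a sync.RWMutex (the type and method names are made up for illustration):

import "sync"

type User struct {
    mu   sync.Mutex
    Name string
}

// SetName guards the struct's fields; the map's lock below does not cover them.
func (u *User) SetName(name string) {
    u.mu.Lock()
    u.Name = name
    u.mu.Unlock()
}

type userMap struct {
    mu sync.RWMutex
    m  map[int]*User
}

func (um *userMap) Get(id int) (*User, bool) {
    um.mu.RLock()
    defer um.mu.RUnlock()
    u, ok := um.m[id]
    return u, ok
}

func (um *userMap) Delete(id int) {
    um.mu.Lock()
    defer um.mu.Unlock()
    delete(um.m, id)
}

Note the two separate locks: one for the map itself, one for the *User it points to, matching the two distinct races described above.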
Simplest example:
go WorkerMethodOne(myMapReference)
go WorkerMethodTwo(myMapReference)
in worker method one I have some code like this (example)
for i := 0; i < len(myMapReference); i++ {
    if i%2 == 0 {
        delete(myMapReference, i)
    }
}
Then when WorkerMethodTwo is iterating that same map and tries to access an item that just got deleted, what happens? A v, ok := myMapReference[i] lookup may not throw the way it would in many languages, but the result no longer makes sense and is unpredictable. Worse things can happen too, such as attempts to concurrently write to the value of some *User: you get concurrent modification of the actual value (what's at the pointer), or the pointer gets pulled out from under you and you end up working with a different value than you expected, etc. It's really no different from making two closures run as goroutines and modifying a non-atomic int without locking/using a mutex. You don't know what's going to happen, since there is contention for that memory between two fully decoupled executions.
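On the "what if the map were thread-safe" part of the question: the standard library does ship sync.Map, which makes the map operations themselves safe, though it still does nothing for concurrent writes to the *User values. A minimal sketch of the two-worker scenario on top of it (a hypothetical setup, just to show the API):

package main

import (
    "fmt"
    "sync"
)

type User struct{ Name string }

func main() {
    var m sync.Map // safe for concurrent use without extra locking
    for i := 0; i < 10; i++ {
        m.Store(i, &User{Name: fmt.Sprintf("user%d", i)})
    }

    var wg sync.WaitGroup
    wg.Add(2)
    go func() {
        // Worker one: delete the even keys.
        defer wg.Done()
        for i := 0; i < 10; i += 2 {
            m.Delete(i)
        }
    }()
    go func() {
        // Worker two: read concurrently. Each Load is safe on its own,
        // but it may or may not observe worker one's deletes.
        defer wg.Done()
        for i := 0; i < 10; i++ {
            if v, ok := m.Load(i); ok {
                fmt.Println(v.(*User).Name)
            }
        }
    }()
    wg.Wait()
}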