Why does the race detector report a race condition here?

I am using Go race detection (the -race argument), and it reports some race conditions that I think should not be reported. I've created this sample code to demonstrate my findings. Please do not comment on the goal of this example; it has no goal other than to explain the issue.
This code:
package main

import (
    "fmt"
    "time"
)

var count int

func main() {
    go update()
    for {
        fmt.Println(count)
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        count++
    }
}
is reported with a race condition.
While this code:
package main

import (
    "fmt"
    "sync"
    "time"
)

var count int
var mutex sync.RWMutex

func main() {
    go update()
    for {
        mutex.RLock()
        fmt.Println(count)
        mutex.RUnlock()
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        mutex.Lock()
        count++
        mutex.Unlock()
    }
}
is not reported with any race condition issues.
My question is why?
There is no bug in the first code.
The main function is reading a variable that another goroutine is updating.
There is no potential hidden bug here.
The mutex in the second code does not provide any different behavior.
Where am I wrong here?

Your code contains a very clear race.
Your for loop is accessing count at the same time that the other goroutine is updating it. That's the definition of a race.
The main function is reading a variable that another goroutine is updating.
Yes, exactly. That's what a race is.
The mutex in the second code does not provide any different behavior.
Yes, it does. It prevents the variable from being read and written at the same time from different goroutines.

You need to draw a distinction between a synchronization bug and a data race. A synchronization bug is a property of the code, whereas a data race is a property of a particular execution of the program. The latter is a manifestation of the former, but is in general not guaranteed to occur.
There is no bug in the first code. The main function is reading a variable that another goroutine is updating. There is no potential hidden bug here.
The race detector only detects data races, not synchronization bugs. It may miss some data races (false negatives), but it never reports false positives:
The race detector is a powerful tool for checking the correctness of concurrent programs. It will not issue false positives, so take its warnings seriously.
In other words, when the race detector reports a data race, you can be sure that your code contains at least one synchronization bug. You need to fix such bugs; otherwise, all bets are off.
Lo and behold, your first code snippet does indeed contain a synchronization bug: package-level variable count is accessed (by main) and updated (by update, started as a goroutine) concurrently without any synchronization. Here is a relevant passage of the Go Memory Model:
Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.
To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.
Using a reader/writer mutual-exclusion lock, as you did in your second snippet, fixes your synchronization bug.
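For completeness, the sync/atomic route mentioned above would also fix it. Here is a minimal sketch (not from the original answer), using atomic.Int64 from Go 1.19:

package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

var count atomic.Int64 // atomic counter; no mutex needed

func main() {
    go update()
    for {
        fmt.Println(count.Load()) // atomic read
        time.Sleep(time.Second)
    }
}

func update() {
    for {
        time.Sleep(time.Second)
        count.Add(1) // atomic increment
    }
}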
The mutex in the second code does not provide any different behavior.
You just got lucky, when you executed the first program, that no data race occurred. In general, you have no guarantee.

This is off topic for Go (and the sample Go code won't trigger the problem even on x86 CPUs), but I have a demonstration proof, from roughly a decade ago at this point, that "torn reads" can produce inconsistent values even if the read and write operations are done with LOCK CMPXCHG8B, on some x86 CPUs (I think we were using early Haswell implementations).
The particular conditions that trigger this are a little difficult to set up. We had a custom allocator that had a bug: it only did four-byte alignment.[1] We then had a "lock-free" (single locking instruction) algorithm to add entries to a queue, with single-writer multi-reader semantics.
It turns out that LOCK CMPXCHG8B instructions "work" on misaligned pointers as long as they do not cross page boundaries. As soon as they do, though, the readers can see a torn read, in which they get half the old value and half the new value, when a writer is doing an atomic write.
The result was an extremely difficult-to-track-down bug, where the system would run well for hours or even days before tripping over one of these. I finally diagnosed it by observing the data patterns, and eventually tracked the problem down to the allocator.
[1] Whether this is a bug depends on how one uses the allocated objects, but we were using them as 8-byte-wide pointers with LOCK CMPXCHG8B instructions.

Related

When can the Go compiler reorder commands, and how do sync primitives affect that?

I have read https://golang.org/ref/mem, but there are some parts that are still unclear to me.
For instance, in the section "Channel communication" it says: "The write to a happens before the send on c", but I don't know why that is the case. I am copying the sample code from the mentioned page below, for context.
var c = make(chan int, 10)
var a string

func f() {
    a = "hello, world"
    c <- 0
}

func main() {
    go f()
    <-c
    print(a)
}
From the point of view of a single goroutine that assertion is true; however, from the point of view of another goroutine it cannot be inferred from the guarantees the text has mentioned so far.
So my question is: are there other guarantees that are not explicitly stated in this document? For instance, can we say that a sync primitive such as sending on a channel ensures that commands placed before it will not be moved after it by the compiler? And what about the commands that come after it: can we say they will not be moved before the sync primitive?
What about the operations offered in the atomic package? Do they provide the same guarantees as channel operations?
can we say that a sync primitive such as sending on a channel ensures that commands placed before it will not be moved after it by the compiler?
That is exactly what the memory model says. Within a single goroutine, the compiler may rearrange the execution order only as long as the effects of write operations remain visible in program order: if you set a = 1 at some point and read a later, the compiler knows not to move the write past the read. Across multiple goroutines, channels and locks are the synchronization points, so anything that happened before a channel or lock operation is visible to other goroutines once the synchronization point is reached. The compiler will not move code around in a way that lets a write operation cross a synchronization boundary.
The sync/atomic operations provide guarantees as well, and there have been discussions about adding them to the memory model. At the time this was written they were not explicitly stated, and there was an open issue about it (since resolved: the Go 1.19 revision of the memory model documents the sync/atomic operations as behaving like sequentially consistent atomics):
https://github.com/golang/go/issues/5045
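To illustrate what that guarantee looks like, here is a sketch of the memory-model example rewritten to synchronize on an atomic flag instead of a channel; it relies on the sequential consistency that the Go 1.19 memory model documents (atomic.Bool requires Go 1.19+):

package main

import (
    "fmt"
    "sync/atomic"
)

var a string
var done atomic.Bool

func f() {
    a = "hello, world" // plain write
    done.Store(true)   // synchronizing write, ordered after the write to a
}

func main() {
    go f()
    for !done.Load() { // spin until the store is observed
    }
    fmt.Println(a) // guaranteed to print "hello, world"
}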

What is the use-case for TryEnterCriticalSection?

I've been using Windows CRITICAL_SECTION since the 1990s and I've been aware of the TryEnterCriticalSection function since it first appeared. I understand that it's supposed to help me avoid a context switch and all that.
But it just occurred to me that I have never used it. Not once.
Nor have I ever felt I needed to use it. In fact, I can't think of a situation in which I would.
Generally when I need to get an exclusive lock on something, I need that lock and I need it now. I can't put it off until later. I certainly can't just say, "oh well, I won't update that data after all". So I need EnterCriticalSection, not TryEnterCriticalSection.
So what exactly is the use case for TryEnterCriticalSection?
I've Googled this, of course. I've found plenty of quick descriptions of how to use it but almost no real-world examples of why. I did find this example from Intel that, frankly, doesn't help much:
CRITICAL_SECTION cs;

void threadfoo()
{
    while (TryEnterCriticalSection(&cs) == FALSE)
    {
        // some useful work
    }
    // Critical Section of Code
    LeaveCriticalSection(&cs);
    // other work
}
What exactly is a scenario in which I can do "some useful work" while I'm waiting for my lock? I'd love to avoid thread contention, but in my code, by the time I need the critical section, I've already been forced to do all that "useful work" in order to get the values that I'm updating in shared data (for which I need the critical section in the first place).
Does anyone have a real-world example?
As an example you might have multiple threads that each produce a high volume of messages (events of some sort) that all need to go on a shared queue.
Since there's going to be frequent contention on the lock on the shared queue, each thread can have a local queue and then, whenever the TryEnterCriticalSection call succeeds for the current thread, it copies everything it has in its local queue to the shared one and releases the CS again.
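Translated to Go, which gained an analogous sync.Mutex.TryLock in Go 1.18, the pattern might be sketched like this (all names here are illustrative):

package main

import "sync"

var (
    sharedMu    sync.Mutex
    sharedQueue []string // guarded by sharedMu
)

// produce appends a message to this worker's local queue, then
// flushes the local queue into the shared one only if the lock
// happens to be free right now.
func produce(local *[]string, msg string) {
    *local = append(*local, msg)
    if sharedMu.TryLock() { // non-blocking, like TryEnterCriticalSection
        sharedQueue = append(sharedQueue, *local...)
        *local = (*local)[:0]
        sharedMu.Unlock()
    }
    // If the lock was busy, keep accumulating locally and retry
    // on the next call.
}

func main() {
    var local []string
    for i := 0; i < 5; i++ {
        produce(&local, "message")
    }
}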
In C++11 there is std::lock, which employs a deadlock-avoidance algorithm.
In C++17 this was elaborated into the std::scoped_lock class.
The algorithm tries to lock the mutexes in one order, and then in another, until it succeeds. It takes try_lock to implement this approach.
Having a try_lock method in C++ is called the Lockable named requirement, whereas mutexes with only lock and unlock are BasicLockable.
So if you build a C++ mutex on top of CRITICAL_SECTION and want to implement Lockable, or you want to implement deadlock avoidance directly on CRITICAL_SECTION, you'll need TryEnterCriticalSection.
Additionally, you can implement a timed mutex on top of TryEnterCriticalSection: do a few iterations of TryEnterCriticalSection, then call Sleep with an increasing delay, until TryEnterCriticalSection succeeds or the deadline expires. It is not a very good idea, though; real timed mutexes based on user-space Windows synchronization objects are implemented on SleepConditionVariableSRW, SleepConditionVariableCS, or WaitOnAddress.
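For illustration only (the paragraph above recommends against it in practice), that sleep-with-increasing-delay idea might be sketched in Go with sync.Mutex.TryLock (Go 1.18+):

package main

import (
    "fmt"
    "sync"
    "time"
)

// lockWithTimeout polls TryLock with an increasing delay until it
// acquires mu or the deadline passes.
func lockWithTimeout(mu *sync.Mutex, timeout time.Duration) bool {
    deadline := time.Now().Add(timeout)
    delay := time.Microsecond
    for !mu.TryLock() {
        if time.Now().After(deadline) {
            return false
        }
        time.Sleep(delay)
        if delay < time.Millisecond {
            delay *= 2 // back off, capped at 1ms
        }
    }
    return true
}

func main() {
    var mu sync.Mutex
    if lockWithTimeout(&mu, time.Second) {
        defer mu.Unlock()
        fmt.Println("acquired")
    }
}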
Because Windows critical sections are recursive, TryEnterCriticalSection allows a thread to check whether it already owns a CS without the risk of stalling.
Another case would be if you have a thread that occasionally needs to perform some locked work but usually does something else; you could use TryEnterCriticalSection and only perform the locked work if you actually got the lock.

Is it OK to defer an Unlock before a Lock

I'm going over some existing code and see this repeated several times
defer mtx.Unlock()
mtx.Lock()
This looks wrong to me; I much prefer the idiomatic way of deferring the Unlock after performing the Lock. But the documentation for Mutex.Lock doesn't specify a situation where the Lock will fail. Therefore, the behaviour of the early-defer pattern should be identical to the idiomatic way.
My question is: is there a compelling case to say this pattern is inferior? (e.g. the Lock may fail and then the deferred Unlock will panic) and thus the code should be changed or should I leave it as is?
Short answer:
Yes, it's OK. The defer call is made after the function returns (well, sort of).
The longer, more nuanced answer:
It's risky, and should be avoided. In your 2 line snippet, it's not going to be a problem, but consider the following code:
func (o *Obj) foo() error {
    defer o.mu.Unlock()
    if err := o.CheckSomething(); err != nil {
        return err // mu was never locked: the deferred Unlock will panic
    }
    o.mu.Lock()
    // do stuff
    return nil
}
In this case, the mutex may not be locked at all, or worse still: it may be locked by another goroutine, and you end up unlocking it. Meanwhile yet another goroutine obtains a lock that it really shouldn't have, and you either get data races, or by the time that goroutine returns, its unlock call will panic.
Debugging this kind of mess is a nightmare you can, and should, avoid at all cost.
In addition to making the code less intuitive and more error-prone, a defer comes at a cost, depending on what you're trying to achieve. In most cases the cost is fairly marginal, but if you're dealing with something that is absolutely time-critical, it's often better to add the unlock calls manually where needed. A good example where I'd not use defer is caching stuff in a map[string]interface{}: I'd create a struct with the cached values and a sync.RWMutex field for concurrent use. If I use this cache a lot, the defer calls could start adding up. It can look a bit messy, but it's a case-by-case decision: either performance is what you aim for, or shorter, more readable code.
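A sketch of that cache shape, with the unlock calls written out manually on the hot read path (names are illustrative, not from the original answer):

package main

import (
    "fmt"
    "sync"
)

// cache pairs the cached values with an RWMutex and unlocks
// manually instead of deferring on the hot path.
type cache struct {
    mu   sync.RWMutex
    data map[string]interface{}
}

func (c *cache) get(key string) (interface{}, bool) {
    c.mu.RLock()
    v, ok := c.data[key]
    c.mu.RUnlock() // manual unlock: no defer overhead on a hot path
    return v, ok
}

func (c *cache) set(key string, v interface{}) {
    c.mu.Lock()
    c.data[key] = v
    c.mu.Unlock()
}

func main() {
    c := &cache{data: map[string]interface{}{}}
    c.set("a", 1)
    fmt.Println(c.get("a"))
}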
Other things to note about defers:
if you have multiple defers in a function, the order in which they are invoked is defined (LIFO).
Deferred functions can alter the return values of the surrounding function if you use named returns, as the example below shows.
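For example:

package main

import "fmt"

// The deferred closure runs after the return statement has set the
// named result, and modifies it again before the caller sees it.
func increment(n int) (result int) {
    defer func() { result++ }()
    return n // result is set to n, then the defer adds 1
}

func main() {
    fmt.Println(increment(41)) // prints 42
}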

Is sync.Map atomic? Mainly I mean Load, Store, LoadOrStore, Delete

As the title says, I am referring to the Go type sync.Map: can its functions be considered atomic? Mainly the Load, Store, LoadOrStore, and Delete functions.
I also built a simple example (Go Playground): is it guaranteed that only one goroutine can enter the code range of lines 15-17? From my tests it seems to be guaranteed.
Please help explain.
The godoc page for the sync package says: "Map is like a Go map[interface{}]interface{} but is safe for concurrent use by multiple goroutines without additional locking or coordination."
This statement guarantees that there's no need for additional mutexes or synchronization across goroutines. I wouldn't use the word "atomic" (which has a very precise meaning), but it does mean that you don't have to worry about multiple goroutines being able to enter a LoadOrStore block (with the same key) like in your example.
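A small sketch of that guarantee: among concurrent LoadOrStore calls for the same key, exactly one goroutine stores its value and observes loaded == false.

package main

import (
    "fmt"
    "sync"
)

func main() {
    var m sync.Map
    var wg sync.WaitGroup

    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            // Exactly one goroutine wins the store for this key.
            if _, loaded := m.LoadOrStore("key", id); !loaded {
                fmt.Println("goroutine", id, "stored the value")
            }
        }(i)
    }
    wg.Wait() // prints exactly one line
}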

What does not being thread safe mean for maps in Go?

I want to know exactly what could happen when Go maps are accessed by multiple goroutines. Let's assume we have a map[int]*User. Can modifying fields of the User structure from multiple goroutines cause data corruption? Or is it just operations like len() that are not thread safe? What would be different if maps were thread safe in Go?
Concurrently modifying the *User could cause corruption regardless of the map. Reading the pointer from the map concurrently is safe, as long as there are no modifications to the map. Modifying the data *User points to makes no changes to the map itself.
Concurrently modifying the map[int]*User itself also risks data corruption.
There are no benign data races; always test your code with the race detector.
Simplest example:
go WorkerMethodOne(myMapReference)
go WorkerMethodTwo(myMapReference)
In worker method one I have some code like this (example):
for i := 0; i < len(myMapReference); i++ {
    if i%2 == 0 {
        delete(myMapReference, i)
    }
}
Then when WorkerMethodTwo is iterating that same map and tries to access an item that just got deleted, what happens? A lookup like v, ok := m[key] won't throw the way it might in other languages, but with a concurrent writer the runtime may detect the conflicting access and abort, and the result is unpredictable in any case. Worse things can happen, like concurrent writes to the value some *User points to: you can get concurrent modification of the actual value (what's at the pointer), or have the pointer pulled out from under you and randomly end up working with a different value than you expected, etc. It's really no different than if you made two closures run as goroutines and started modifying a non-atomic int without locking or using a mutex. You don't know what's going to happen, since there is contention for that memory between two fully decoupled executions.
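A minimal runnable distillation of that scenario (names are illustrative). Run it with go run -race and the detector flags the conflicting accesses; without -race the runtime may abort with a fatal concurrent map access error:

package main

import "sync"

type User struct{ Name string }

func main() {
    users := map[int]*User{0: {Name: "a"}, 1: {Name: "b"}}
    var wg sync.WaitGroup
    wg.Add(2)

    go func() { // writer: mutates the map itself
        defer wg.Done()
        for i := 0; i < 1000; i++ {
            delete(users, i%2)
            users[i%2] = &User{Name: "x"}
        }
    }()
    go func() { // reader: looks up entries concurrently
        defer wg.Done()
        for i := 0; i < 1000; i++ {
            if u, ok := users[i%2]; ok {
                _ = u.Name
            }
        }
    }()
    wg.Wait()
}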
