Race condition even when using sync.Mutex in golang

Complete code is here: https://play.golang.org/p/ggUoxtcv5m
go run -race main.go says there is a race condition there which I fail to explain.
The program outputs correct final result, though.
The essence:
type SafeCounter struct {
c int
sync.Mutex
}
func (c *SafeCounter) Add() {
c.Lock()
c.c++
c.Unlock()
}
var counter *SafeCounter = &SafeCounter{} // global
use *SafeCounter in incrementor:
func incrementor(s string) {
for i := 0; i < 20; i++ {
x := counter
x.Add()
counter = x
}
}
The incrementor method is spawned twice in main:
func main() {
go incrementor("Incrementor1:")
go incrementor("Incrementor2:")
// some other not-really-related stuff, like
// using a WaitGroup, is omitted here to showcase the problem
}
So, as I said, go run -race main.go always reports that a race condition was found.
Also, the final result is always correct (at least, I've run this program a number of times and it always says the final counter is 40, which is correct).
BUT, the program prints incorrect values in the beginning so you can get something like:
Incrementor1: 0 Counter: 2
Incrementor2: 0 Counter: 3
Incrementor2: 1 Counter: 4
// and the rest is ok
so the line printing Counter: 1 is missing there.
Can somebody explain why there is a race condition in my code?

You have a number of race conditions, all pointed out specifically by the race detector:
x := counter // this reads the counter value without a lock
fmt.Println(&x.c)
x.Add()
counter = x // this writes the counter value without a lock
time.Sleep(time.Duration(rand.Intn(3)) * time.Millisecond)
fmt.Println(s, i, "Counter:", x.c) // this reads the c field without a lock
race #1 is between the read and the write of the counter value in incrementor
race #2 is between the concurrent writes to the counter value in incrementor
race #3 is between the read of the x.c field in fmt.Println, and the increment to x.c in the Add method.

The two lines that read and write the counter pointer are not protected by the mutex and are done concurrently from multiple goroutines.
func incrementor(s string) {
for i := 0; i < 20; i++ {
x := counter // <-- this pointer read
x.Add()
counter = x // <-- races with this pointer write
}
}
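
Putting both answers together, a race-free version could look like the sketch below. It assumes the counter pointer is set once and never reassigned, and that every read of c goes through a locked method. The Value method and the runIncrementors helper are illustrative names, not part of the original program.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeCounter guards its count with a mutex.
type SafeCounter struct {
	mu sync.Mutex
	c  int
}

// Add increments the count and returns the new value, all under the lock.
func (s *SafeCounter) Add() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.c++
	return s.c
}

// Value reads the count under the lock (hypothetical accessor).
func (s *SafeCounter) Value() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.c
}

// runIncrementors starts n goroutines that each call Add perGoroutine
// times, waits for all of them, and returns the final count.
func runIncrementors(n, perGoroutine int) int {
	counter := &SafeCounter{} // the pointer is written once and only read afterwards
	var wg sync.WaitGroup
	wg.Add(n)
	for g := 0; g < n; g++ {
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				counter.Add() // no "x := counter ... counter = x" round trip
			}
		}()
	}
	wg.Wait()
	return counter.Value()
}

func main() {
	fmt.Println("final counter:", runIncrementors(2, 20)) // final counter: 40
}
```

If you want to print intermediate values, print the value returned by Add rather than reading x.c directly, since that read would again happen outside the lock.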

Related

Can WaitGroup be used with normal functions

Can I use a WaitGroup with normal functions, and not only with goroutines?
I have following type
type Manager struct {
....
wg sync.WaitGroup
}
func (m *Manager) create() {
m.wg.Add(1)
defer m.wg.Done()
....
....
}
func (m *Manager) close() {
m.wg.Wait()
}
It is working fine for me; I just want to know if this is correct.
In the concurrent context, a waitgroup allows you to halt a goroutine until the group is "done". If you use the waitgroup incorrectly, you can have these erroneous outcomes:
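Fewer Done than Add: The waitgroup never finishes, and the waiting goroutine halts forever (the runtime aborts with "fatal error: all goroutines are asleep - deadlock!" if every goroutine is blocked, or the program hangs silently otherwise)
Fewer Add than Done: panics with "sync: negative WaitGroup counter". See WaitGroup.Add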
In the non-concurrent context, you won't receive any benefit from the synchronization of the waitgroup, so the only effect it can really have is demonstrating that the total amount added to the counter and the total amount taken away are equal (as is in correct use of a waitgroup). However, in the case of incorrect usage, it can result in a silent failure (see above), so you should not use it in this way.
There could be a legitimate use case where you want to increment / decrement a counter and then verify that it has been resolved to 0 at the end. In a non-concurrent context, you don't need such a fancy tool to do this: just use an int!
For example:
var counter int
// setup
// vv equivalent to wg.Add
counter += expectedNumberOfActions
for range actions { // no loop variable: an unused x would not compile
// do something
// vv equivalent to wg.Done
counter--
}
// vv achieves the purpose of wg.Wait
if counter != 0 {
panic("oh no! the counter was not resolved correctly. there may be some bug in the implementation")
}

Counter that rotates that is safe for concurrent use

I want to write something like this:
type RoundRobinList struct {
lst []string
idx uint32
}
// rr_list is a RoundRobinList
func loop(rr_list) {
start = rr_list.idx
rr_list.idx = (rr_list.idx + 1)%len(rr_list.lst)
print(rr_list.lst[idx])
}
If rr_list.lst = ["a", "b", "c"], and loop is called over and over, I would expect the following printed:
"a"
"b"
"c"
"a"
"b"
"c" ...
Is this safe? rr_list.idx = (rr_list.idx + 1)%len(rr_list.lst)
Whenever you have a read and write of a value that's not protected in some way (using a mutex, or some of the other things in the sync package), you have a race, and that means you can't use it safely in a concurrent setting.
Here, you're reading and writing the idx field of your RoundRobinList structure without any protection, so you'll have a race if you use it concurrently.
As a first line of defense you should understand the rules for memory safety and follow them carefully, and not write code unless you're pretty sure it's safe. As a second line of defense, you can use the race detector to find a lot of problems with lack of safety of concurrent access.
Here's a simple test case, in file a_test.go. I had to also fix some bugs in the code to get it to compile. It starts two goroutines which call loop 1000 times each on a shared RoundRobinList.
package main
import (
"sync"
"testing"
)
type RoundRobinList struct {
lst []string
idx uint32
}
func loop(rr *RoundRobinList) {
rr.idx = (rr.idx + 1) % uint32(len(rr.lst))
}
func TestLoop(t *testing.T) {
rr := RoundRobinList{[]string{"a", "b", "c"}, 0}
var wg sync.WaitGroup
wg.Add(2)
for n := 0; n < 2; n++ {
go func() {
for i := 0; i < 1000; i++ {
loop(&rr)
}
wg.Done()
}()
}
wg.Wait()
}
Running with go test -race ./a_test.go results in:
==================
WARNING: DATA RACE
Read at 0x00c0000b0078 by goroutine 9:
command-line-arguments.loop()
/mnt/c/Users/paul/Desktop/a_test.go:14 +0xa1
command-line-arguments.TestLoop.func1()
/mnt/c/Users/paul/Desktop/a_test.go:24 +0x78
Previous write at 0x00c0000b0078 by goroutine 8:
command-line-arguments.loop()
/mnt/c/Users/paul/Desktop/a_test.go:14 +0x4e
command-line-arguments.TestLoop.func1()
/mnt/c/Users/paul/Desktop/a_test.go:24 +0x78
Goroutine 9 (running) created at:
command-line-arguments.TestLoop()
/mnt/c/Users/paul/Desktop/a_test.go:22 +0x1cc
testing.tRunner()
/usr/local/go/src/testing/testing.go:992 +0x1eb
Goroutine 8 (finished) created at:
command-line-arguments.TestLoop()
/mnt/c/Users/paul/Desktop/a_test.go:22 +0x1cc
testing.tRunner()
/usr/local/go/src/testing/testing.go:992 +0x1eb
==================
--- FAIL: TestLoop (0.00s)
testing.go:906: race detected during execution of test
FAIL
FAIL command-line-arguments 0.006s
FAIL
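
One possible fix, sketched here under the assumption that callers only need "the next element" and never read idx directly, is to claim each position with atomic.AddUint32 from sync/atomic. The Next method and the countPicks helper are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type RoundRobinList struct {
	lst []string
	idx uint32
}

// Next atomically claims the next position and returns its element.
// AddUint32 returns the incremented value, so subtracting 1 yields the
// position this caller owns; no two callers can get the same one.
func (rr *RoundRobinList) Next() string {
	n := atomic.AddUint32(&rr.idx, 1) - 1
	return rr.lst[n%uint32(len(rr.lst))]
}

// countPicks calls Next from several goroutines and tallies the results.
func countPicks(rr *RoundRobinList, goroutines, calls int) map[string]int {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		counts = make(map[string]int)
	)
	wg.Add(goroutines)
	for g := 0; g < goroutines; g++ {
		go func() {
			defer wg.Done()
			for i := 0; i < calls; i++ {
				s := rr.Next()
				mu.Lock() // the tally map is shared, so it needs its own lock
				counts[s]++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counts
}

func main() {
	rr := &RoundRobinList{lst: []string{"a", "b", "c"}}
	fmt.Println(countPicks(rr, 2, 1500)) // map[a:1000 b:1000 c:1000]
}
```

Two caveats with this sketch: the order in which goroutines print or use their elements is still up to the scheduler, and when the uint32 counter wraps around the rotation skips unless the list length divides 2^32. Guarding idx with a sync.Mutex and storing it already reduced modulo the length avoids the second issue.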

unclear on reasons why there is a race condition

The question concerns the following code:
package main
import "fmt"
func main() {
var counters = map[int]int{}
for i := 0; i < 5; i++ {
go func(counters map[int]int, th int) {
for j := 0; j < 5; j++ {
counters[th*10+j]++
}
}(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
}
Here is the output I get when I run this code with go run -race race.go
$ go run -race race.go
==================
WARNING: DATA RACE
Read at 0x00c000092150 by goroutine 8:
runtime.mapaccess1_fast64()
/usr/lib/go-1.13/src/runtime/map_fast64.go:12 +0x0
main.main.func1()
/tmp/race.go:10 +0x6b
Previous write at 0x00c000092150 by goroutine 7:
runtime.mapassign_fast64()
/usr/lib/go-1.13/src/runtime/map_fast64.go:92 +0x0
main.main.func1()
/tmp/race.go:10 +0xaf
Goroutine 8 (running) created at:
main.main()
/tmp/race.go:8 +0x67
Goroutine 7 (finished) created at:
main.main()
/tmp/race.go:8 +0x67
==================
==================
WARNING: DATA RACE
Read at 0x00c0000aa188 by main goroutine:
reflect.typedmemmove()
/usr/lib/go-1.13/src/runtime/mbarrier.go:177 +0x0
reflect.copyVal()
/usr/lib/go-1.13/src/reflect/value.go:1297 +0x7b
reflect.(*MapIter).Value()
/usr/lib/go-1.13/src/reflect/value.go:1251 +0x15e
internal/fmtsort.Sort()
/usr/lib/go-1.13/src/internal/fmtsort/sort.go:61 +0x259
fmt.(*pp).printValue()
/usr/lib/go-1.13/src/fmt/print.go:773 +0x146f
fmt.(*pp).printArg()
/usr/lib/go-1.13/src/fmt/print.go:716 +0x2ee
fmt.(*pp).doPrintln()
/usr/lib/go-1.13/src/fmt/print.go:1173 +0xad
fmt.Fprintln()
/usr/lib/go-1.13/src/fmt/print.go:264 +0x65
main.main()
/usr/lib/go-1.13/src/fmt/print.go:274 +0x13c
Previous write at 0x00c0000aa188 by goroutine 10:
main.main.func1()
/tmp/race.go:10 +0xc4
Goroutine 10 (finished) created at:
main.main()
/tmp/race.go:8 +0x67
==================
counters result map[0:1 1:1 2:1 3:1 4:1 10:1 11:1 12:1 13:1 14:1 20:1 21:1 22:1 23:1 24:1 30:1 31:1 32:1 33:1 34:1 40:1 41:1 42:1 43:1 44:1]
Found 2 data race(s)
exit status 66
Here is what I can't understand. Why is there a race condition at all? Aren't we reading/writing values only one goroutine can access? For example, routine 0 will modify values only in counters[0] through counters[4], routine 1 will modify values only in counters[10] through counters[14], routine 2 will only modify values in counters[20] through counters[24], and so on. I'm not seeing a race condition here. Feels like I'm missing something. Will someone be able to shed some light on this?
Just an FYI I'm new to go. If you could dumb down the explanation (if it is possible) I would appreciate it.
That would be true for an array (or a slice), but a map is a complicated data structure which, among others, has the following properties:
It's free to relocate the elements stored in it in memory at any time it sees fit.
A map is initially empty, and placing an element in it (what appears as assignment in your case) involves a lot of operations on the map's internals.
Additionally, what you do in your case, incrementing an integer stored in a map, is really a map lookup, an increment, and a map store.
The first and the last operations involve lookup by key.
Now consider what happens if one goroutine performs lookup at the same time another goroutine modifies the map's internal state when performing map store.
You might want to read up a bit on what is an associative array, and how it's typically implemented.
Aren't we reading/writing values only one go routine can access?
You already got a great answer from @kostix on that matter: the internals of the map are modified when you add elements to it, so it's not accurate to think that routine 0 will modify values only in counters[0] through counters[4].
But that's not all.
There's yet another data race issue in your code that's a bit more subtle and might be very difficult to catch even in tests.
To explore it, let's get rid of the "map internals" issue that @kostix mentioned, by imagining that your code is almost exactly the same, but with one tiny change: instead of using a map[int]int, imagine that you're using a []int, initialized to have at least length 56. Something like this:
// THERE'S ANOTHER RACE CONDITION HERE.
// var counters = map[int]int{}
var counters = make([]int, 56)
for i := 0; i < 5; i++ {
// go func(counters map[int]int, th int) {
go func(counters []int, th int) {
for j := 0; j < 5; j++ {
counters[th*10+j]++
}
}(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
This is nearly equivalent, but gets rid of the "map internals" issue. The goal is to shift the focus away from "map internals" to show you the second issue.
There's still a race condition there. By the way, it's also similar to a race condition that exists in the first attempted solution in another answer you got, that uses a sync.Mutex but in a way that is still wrong.
The problem here is that there's no happens before relationship between the operations that change the counters and the operation that reads from it.
The fmt.Scanln() doesn't help: even though it allows you to introduce an arbitrary time delay between the code right before it (i.e., when the for loop launches the goroutines) and the code right after it (i.e., the fmt.Println()) — so that you could think "Ok, I'm just gonna wait 'a reasonably long amount of time' before pressing Enter", that doesn't eliminate the race condition.
The race condition here arises from the fact that "passage of time" (i.e., you waiting to hit Enter) does not establish a happens-before relationship between the writes to counters and the reads from it.
This notion of happens-before is absolutely fundamental for avoiding data races: you can only guarantee the absence of a data race if you can guarantee the existence of a happens-before relationship between 2 operations.
Like I mentioned, "passage of time" doesn't establish a "happens before". To establish it, you could use one of many alternatives, including primitives in the sync or atomic packages, or channels, etc.
While I'd probably suggest focusing on studying channels, and then the sync package (sync.Mutex, sync.WaitGroup, etc), and maybe only after all that the atomic package, if you do want to read more about this idea of happens before from the authoritative source, here's the link: https://golang.org/ref/mem . But be warned that it's a nasty can of worms.
Hopefully these comments here help you see why it's absolutely fundamental to follow the standard patterns for concurrency in Go. Things can be way more subtle than at first sight.
And to conclude, a quote from The Go Memory Model link I shared above:
If you must read the rest of this document to understand the behavior of your program, you are being too clever.
Don't be clever.
EDIT: for completion, here's how you could solve the problem.
There are 2 parts to the solution: (1) make sure that there are no concurrent modifications to the map; (2) make sure that there's a happens-before relationship between all the changes to the map and the read.
For (1), you can use a sync.Mutex. Lock it before writing, unlock it after the write.
For (2), you need to ensure that the main goroutine can only get to the fmt.Println() after all the modifications are done. And remember: here, after doesn't mean "at a later point in time", but it specifically means that a happens-before relationship must be established. The 2 common patterns to solve this are to use a channel or a sync.WaitGroup. The WaitGroup solution is probably easier to reason about here, so that's what I'd use.
var mu sync.Mutex // (A)
var wg sync.WaitGroup // (A)
var counters = map[int]int{}
wg.Add(5) // (B)
for i := 0; i < 5; i++ {
go func(counters map[int]int, th int) {
for j := 0; j < 5; j++ {
mu.Lock() // (C)
counters[th*10+j]++
mu.Unlock() // (C)
}
wg.Done() // (D)
}(counters, i)
}
wg.Wait() // (E)
fmt.Scanln()
fmt.Println("counters result", counters)
(A) You don't need to initialize either the Mutex or the WaitGroup, since their zero values are ready to use. Also, you don't need to make them pointers to anything.
(B) You .Add(5) to the WaitGroup's counter, meaning that it will have to wait for 5 .Done() signals before proceeding if you .Wait() on it. The number 5 here is because you're launching 5 goroutines, and you need to establish happens-before relationships between the changes made on all of them and the main goroutine's fmt.Println().
(C) You .Lock() and .Unlock() the Mutex around modifications to the map, to ensure that they are not done concurrently.
(D) Just before each goroutine terminates, you call wg.Done(), which decrements the WaitGroup's internal counter.
(E) Finally, you wg.Wait(). This function blocks until the wg's counter reaches 0. And here's the super important piece: the WaitGroup establishes a happens-before relationship between the calls to wg.Done() and the return of the wg.Wait() call. In other words, from a memory consistency perspective, the main goroutine is guaranteed to see all the changes performed to the map by all the goroutines!
AND FINALLY you can run that code with -race and be happy!
For you to explore further: instead of map + sync.Mutex, you could replace that with just sync.Map. But the sync.WaitGroup would still be necessary. Try to write a solution using that, it might be a nice exercise.
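
The channel pattern mentioned as the alternative to sync.WaitGroup can be sketched as follows. To keep the focus on the happens-before part, this sketch uses the []int variant from earlier, where each goroutine touches only its own elements and no mutex is needed. The fillCounters helper and its parameters are made up for illustration, and perWorker is assumed to be at most 10.

```go
package main

import "fmt"

// fillCounters starts `workers` goroutines; worker th increments
// counters[th*10+j] for j in [0, perWorker). It returns only after every
// worker has signalled completion on the done channel.
func fillCounters(workers, perWorker int) []int {
	counters := make([]int, workers*10)
	done := make(chan struct{})
	for w := 0; w < workers; w++ {
		go func(th int) {
			for j := 0; j < perWorker; j++ {
				counters[th*10+j]++ // distinct slice elements per goroutine
			}
			// The writes above happen before this send, and the send
			// happens before the matching receive below completes.
			done <- struct{}{}
		}(w)
	}
	for w := 0; w < workers; w++ {
		<-done // one receive per worker
	}
	return counters // safe to read: every write happens before this point
}

func main() {
	counters := fillCounters(5, 5)
	sum := 0
	for _, v := range counters {
		sum += v
	}
	fmt.Println("total increments:", sum) // total increments: 25
}
```

With a map instead of a slice you would still need the mutex from (C) around each write; the channel only replaces the WaitGroup.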
In addition to @kostix's answer: you have to know that multiple goroutines should not access (write/read) the same resource at the same time without synchronization.
So, in your implementation, you can easily end up with multiple goroutines concurrently updating (reading/writing) the same resource, which is your map.
What should happen? Which value should end up under a given map key? This is what is called a race condition.
Here are some potential fixes to your code:
Using Mutex:
package main
import (
"fmt"
"sync"
)
func main() {
var counters = map[int]int{}
var mutex = &sync.Mutex{}
for i := 0; i < 3; i++ {
go func(counters map[int]int, th int) {
for j := 0; j < 3; j++ {
mutex.Lock() // Lock the access to the map
counters[th*10+j]++
mutex.Unlock() // Release the access
}
}(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
}
Output:
counters result map[0:1 1:1 2:1 10:1 11:1 12:1 20:1 21:1 22:1]
Using sync.Map:
package main
import (
"fmt"
"sync"
)
func main() {
var counters sync.Map
for i := 0; i < 3; i++ {
go func(th int) {
for j := 0; j < 3; j++ {
if result, ok := counters.Load(th*10 + j); ok {
value := result.(int) + 1
counters.Store(th*10+j, value) // value is already incremented; storing value+1 would add 2
} else {
counters.Store(th*10+j, 1)
}
}
}(i)
}
fmt.Scanln()
counters.Range(func(k, v interface{}) bool {
fmt.Println("key:", k, ", value:", v)
return true
})
}
Output:
key: 21 , value: 1
key: 10 , value: 1
key: 11 , value: 1
key: 0 , value: 1
key: 1 , value: 1
key: 20 , value: 1
key: 2 , value: 1
key: 22 , value: 1
key: 12 , value: 1

Does a new run of for loop ends the scope of last run of for loop?

I do not understand why following program prints 0 1 2. I thought it will print 2 2 2.
package main
import (
"fmt"
)
func main() {
var funcs []func()
for i := 0; i < 3; i++ {
idx := i
funcs = append(funcs, func() { fmt.Println(idx) })
}
for _, f := range funcs {
f()
}
}
My reasoning for why it should print 2 2 2 is that each run of the for loop shares the same scope (e.g., the 2nd run of the for loop does not terminate the scope of the 1st run; the scope is shared). Thus the reference to idx is shared by the anonymous functions created within each run of the for loop. So when the loop ends, all 3 functions share the same reference to idx, whose value is 2.
So I think the question boils down to: does a new run (e.g., i == 2) of the for loop end the scope of the last run (e.g., i == 1)? I would appreciate it if an answer could point me to the Go spec. (I could not find the spec mentioning this.)
From spec https://golang.org/ref/spec#For_statements
A "for" statement specifies repeated execution of a block.
each of those blocks has its own scope and they are not nested or shared. But
Each "if", "for", and "switch" statement is considered to be in its
own implicit block.
so variable i in your snippet is shared and
for i := 0; i < 3; i++ {
funcs = append(funcs, func() { fmt.Println(i) })
}
for _, f := range funcs {
f()
}
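will print 3 3 3 as expected. (This holds for Go versions before 1.22; since Go 1.22 each iteration of such a loop gets its own copy of i, so the same snippet prints 0 1 2.)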
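
A small sketch of the difference between closures that share one variable and closures that each capture their own copy. To keep the result independent of the Go version, the shared variable is deliberately declared outside the for statement here. The function names sharedVar and copiedVar are made up for this example.

```go
package main

import "fmt"

// call runs every closure and collects the results.
func call(funcs []func() int) []int {
	var out []int
	for _, f := range funcs {
		out = append(out, f())
	}
	return out
}

// sharedVar builds three closures over ONE variable that lives outside
// the loop, so they all observe its final value.
func sharedVar() []int {
	var funcs []func() int
	var i int // declared once, outside the for statement
	for i = 0; i < 3; i++ {
		funcs = append(funcs, func() int { return i })
	}
	return call(funcs)
}

// copiedVar declares idx inside the loop body, so each iteration's
// block gets a fresh variable and each closure keeps its own value.
func copiedVar() []int {
	var funcs []func() int
	var i int
	for i = 0; i < 3; i++ {
		idx := i // new variable per iteration
		funcs = append(funcs, func() int { return idx })
	}
	return call(funcs)
}

func main() {
	fmt.Println(sharedVar()) // [3 3 3]
	fmt.Println(copiedVar()) // [0 1 2]
}
```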

unexpected behavior with loops and goroutinues in go

Why does this:
for i := 0; i < 3; i++ {
go func(i int) {
fmt.Printf("%d", i)
}(i)
}
print 012
while this:
for i := 0; i < 3; i++ {
go func() {
fmt.Printf("%d", i)
}()
}
print 333?
While goroutines are cheap, they aren't free. There is some, but little, overhead in creating them.
In your first program, the value of i is preserved into the goroutines because you're passing it in as an argument. (Each goroutine gets its own copy of i's value at that moment.)
In your second program, the value of i is already 3 before the first goroutine has started. Remember that goroutines share the same memory space in a Go program, so in this case, each goroutine is looking at the same i when it prints it out.
Adding a print statement just after your for loop should make things clear for you. You’ll see that that print statement runs before your goroutine function.
When all you do in a for loop is launch a new goroutine, your loop passes very fast and usually finishes even before your first goroutine starts. So when your goroutines start, your loop is already finished and the value of i is 3. Keep that in mind.
When you pass i as a function argument, like you do in your first example, its current value is copied to the function stack, so the function receives its current value as an argument. That’s why you’ll see 012. But when a closure function just uses a variable in its surrounding scope, like you do in your second example, it accesses its current value when it is run, which in your case is after the loop has finished and i has reached 3.
You can see this effect with this code:
for i := 0; i < 3; i++ {
go func(arg int) {
fmt.Printf("%d %d\n", arg, i)
}(i)
}
which produces this output:
0 3
1 3
2 3
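
A minimal sketch of the argument-passing pattern, assuming you also want the caller to wait for the goroutines before using their results. The collect helper is a hypothetical name. Each goroutine receives its own copy of the loop value and writes to its own slice element, and the sync.WaitGroup makes those writes visible to the caller.

```go
package main

import (
	"fmt"
	"sync"
)

// collect starts n goroutines; goroutine number arg stores arg*10 in
// results[arg]. Because the value is passed as an argument, each
// goroutine works with the value i had when the goroutine was created.
func collect(n int) []int {
	results := make([]int, n)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(arg int) {
			defer wg.Done()
			results[arg] = arg * 10 // each goroutine owns one element
		}(i)
	}
	wg.Wait() // all writes happen before Wait returns
	return results
}

func main() {
	fmt.Println(collect(3)) // [0 10 20]
}
```

Note that the goroutines themselves may still run in any order; only the position each one writes to is fixed by its argument.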
