Why does a race condition with goroutines sometimes not happen?

I'm reading go-in-action. This example is from chapter6/listing09.go.
// This sample program demonstrates how to create race
// conditions in our programs. We don't want to do this.
package main

import (
    "fmt"
    "runtime"
    "sync"
)

var (
    // counter is a variable incremented by all goroutines.
    counter int

    // wg is used to wait for the program to finish.
    wg sync.WaitGroup
)

// main is the entry point for all Go programs.
func main() {
    // Add a count of two, one for each goroutine.
    wg.Add(2)

    // Create two goroutines.
    go incCounter(1)
    go incCounter(2)

    // Wait for the goroutines to finish.
    wg.Wait()
    fmt.Println("Final Counter:", counter)
}

// incCounter increments the package level counter variable.
func incCounter(id int) {
    // Schedule the call to Done to tell main we are done.
    defer wg.Done()

    for count := 0; count < 2; count++ {
        // Capture the value of Counter.
        value := counter

        // Yield the thread and be placed back in queue.
        runtime.Gosched()

        // Increment our local value of Counter.
        value++

        // Store the value back into Counter.
        counter = value
    }
}
If you run this code on play.golang.org, it prints 2, the same as in the book.
But my Mac prints 4 most of the time, sometimes 2, and sometimes even 3.
$ go run listing09.go
Final Counter: 2
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 2
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 2
$ go run listing09.go
Final Counter: 3
System info:
go version go1.8.1 darwin/amd64
macOS Sierra
MacBook Pro
Explanation from the book (p. 140):
Each goroutine overwrites the work of the other. This happens when the goroutine swap is taking place. Each goroutine makes its own copy of the counter variable and then is swapped out for the other goroutine. When the goroutine is given time to execute again, the value of the counter variable has changed, but the goroutine doesn't update its copy. Instead it continues to increment the copy it has and set the value back to the counter variable, replacing the work the other goroutine performed.
According to this explanation, this code should always print 2.
Why did I get 4 and 3? Is it because the race condition didn't happen?
Why does the Go playground always print 2?
Update:
After I set runtime.GOMAXPROCS(1), it starts printing 2 (never 4, occasionally 3). I guess play.golang.org is configured with one logical processor.
Without a race condition, the right result is 4. One logical processor means one thread, and Go defaults to as many logical processors as there are physical cores. So:
Why does one thread (one logical processor) lead to the race condition, while multiple threads print the right answer?
Can we say the explanation from the book is wrong, since we also get 3 and 4?
How does it get 3? 4 is the correct result.

Race conditions are, by definition, nondeterministic. This means that while you may get a particular answer most of the time, it will not always be so.
By running racy code on multiple cores you greatly increase the number of possible interleavings, hence you get a broader selection of results.
See this post or this Wikipedia article for more information on race conditions.
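For contrast, a minimal race-free variant of the listing (a sketch, not from the book) replaces the read-modify-write with a single atomic operation from sync/atomic; it always prints 4 and go run -race reports nothing:

// Race-free version of the counter example: the racy
// read-modify-write is replaced by a single atomic add.
package main

import (
    "fmt"
    "runtime"
    "sync"
    "sync/atomic"
)

var (
    counter int64
    wg      sync.WaitGroup
)

func main() {
    wg.Add(2)
    go incCounter(1)
    go incCounter(2)
    wg.Wait()
    fmt.Println("Final Counter:", atomic.LoadInt64(&counter)) // always 4
}

func incCounter(id int) {
    defer wg.Done()
    for count := 0; count < 2; count++ {
        atomic.AddInt64(&counter, 1) // indivisible increment, no race window
        runtime.Gosched()
    }
}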

Related

Don't goroutines and channels work in order of call?

Don't goroutines and channels work in the order they were called?
And do goroutines share values between local variables?
main.go
package main

import (
    "fmt"
    "time"
)

var dataSendChannel = make(chan int)

func main() {
    a(dataSendChannel)
    time.Sleep(time.Second * 10)
}

func a(c chan<- int) {
    for i := 0; i < 1000; i++ {
        go b(dataSendChannel)
        c <- i
    }
}

func b(c <-chan int) {
    val := <-c
    fmt.Println(val)
}
output
> go run main.go
0
1
54
3
61
5
6
7
8
9
Channels are ordered. Goroutines are not. Goroutines may run, or stall, more or less at random, whenever they are logically allowed to run. They must stop and wait whenever you force them to do so, e.g., by attempting to write on a full channel, or using a mutex.Lock() call on an already-locked mutex, or any of those sorts of things.
Your dataSendChannel is unbuffered, so an attempt to write to it will pause until some goroutine is actively attempting to read from it. Function a spins off one goroutine that will attempt one read (go b(...)), then writes and therefore waits for at least one reader to be reading. Function b immediately begins reading, waiting for data. This unblocks function a, which can now write some integer value.
Function a can now spin off another instance of b, which begins reading; this may happen before, during, or after the b that got a value begins calling fmt.Println. This second instance of b must now wait for someone (which in this case is always function a, running the loop) to send another value, but a does that as quickly as it can.
The second instance of b can now begin calling fmt.Println, but it might, mostly-randomly, not get a chance to do that yet. The first instance of b might already be in fmt.Println, or maybe it isn't yet, and the second one might run first. Or maybe both wait around for a while and a third instance of b spins up, reads some value from the channel, and so on.
There's no guarantee which instance of b actually gets into fmt.Println when, so the values you see printed will come out in some semi-random order. If you want the various b instances to sequence themselves, they will need to do that somehow.
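For example, one way to get ordered output (a sketch, not part of the original answer; the done channel is an addition) is to use a single consumer instead of many competing b goroutines; channel ordering then carries through to the printed output:

package main

import "fmt"

var dataSendChannel = make(chan int)

func main() {
    done := make(chan struct{}) // signals that b has printed everything
    go b(dataSendChannel, done)
    a(dataSendChannel)
    close(dataSendChannel) // no more values; ends b's range loop
    <-done                 // wait for b to finish printing
}

func a(c chan<- int) {
    for i := 0; i < 1000; i++ {
        c <- i
    }
}

// b is now the only reader, so it prints 0..999 in order.
func b(c <-chan int, done chan<- struct{}) {
    for val := range c {
        fmt.Println(val)
    }
    close(done)
}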

unclear on reasons why there is a race condition

The question concerns the following code:
package main

import "fmt"

func main() {
    var counters = map[int]int{}

    for i := 0; i < 5; i++ {
        go func(counters map[int]int, th int) {
            for j := 0; j < 5; j++ {
                counters[th*10+j]++
            }
        }(counters, i)
    }
    fmt.Scanln()
    fmt.Println("counters result", counters)
}
Here is the output I get when I run this code with go run -race race.go
$ go run -race race.go
==================
WARNING: DATA RACE
Read at 0x00c000092150 by goroutine 8:
runtime.mapaccess1_fast64()
/usr/lib/go-1.13/src/runtime/map_fast64.go:12 +0x0
main.main.func1()
/tmp/race.go:10 +0x6b
Previous write at 0x00c000092150 by goroutine 7:
runtime.mapassign_fast64()
/usr/lib/go-1.13/src/runtime/map_fast64.go:92 +0x0
main.main.func1()
/tmp/race.go:10 +0xaf
Goroutine 8 (running) created at:
main.main()
/tmp/race.go:8 +0x67
Goroutine 7 (finished) created at:
main.main()
/tmp/race.go:8 +0x67
==================
==================
WARNING: DATA RACE
Read at 0x00c0000aa188 by main goroutine:
reflect.typedmemmove()
/usr/lib/go-1.13/src/runtime/mbarrier.go:177 +0x0
reflect.copyVal()
/usr/lib/go-1.13/src/reflect/value.go:1297 +0x7b
reflect.(*MapIter).Value()
/usr/lib/go-1.13/src/reflect/value.go:1251 +0x15e
internal/fmtsort.Sort()
/usr/lib/go-1.13/src/internal/fmtsort/sort.go:61 +0x259
fmt.(*pp).printValue()
/usr/lib/go-1.13/src/fmt/print.go:773 +0x146f
fmt.(*pp).printArg()
/usr/lib/go-1.13/src/fmt/print.go:716 +0x2ee
fmt.(*pp).doPrintln()
/usr/lib/go-1.13/src/fmt/print.go:1173 +0xad
fmt.Fprintln()
/usr/lib/go-1.13/src/fmt/print.go:264 +0x65
main.main()
/usr/lib/go-1.13/src/fmt/print.go:274 +0x13c
Previous write at 0x00c0000aa188 by goroutine 10:
main.main.func1()
/tmp/race.go:10 +0xc4
Goroutine 10 (finished) created at:
main.main()
/tmp/race.go:8 +0x67
==================
counters result map[0:1 1:1 2:1 3:1 4:1 10:1 11:1 12:1 13:1 14:1 20:1 21:1 22:1 23:1 24:1 30:1 31:1 32:1 33:1 34:1 40:1 41:1 42:1 43:1 44:1]
Found 2 data race(s)
exit status 66
Here is what I can't understand: why is there a race condition at all? Aren't we reading/writing values only one goroutine can access? For example, goroutine 0 will modify values only in counters[0] through counters[4], goroutine 1 only in counters[10] through counters[14], goroutine 2 only in counters[20] through counters[24], and so on. I'm not seeing a race condition here. It feels like I'm missing something. Could someone shed some light on this?
Just an FYI, I'm new to Go. If you could dumb down the explanation (if that's possible), I would appreciate it.
That would be true for an array (or a slice), but a map is a complicated data structure which, among other things, has the following properties:
It is free to relocate the elements stored in it in memory at any time it sees fit.
A map is initially empty, and placing an element in it (what appears as an assignment in your case) involves a lot of operations on the map's internals.
Additionally, in a case like yours, incrementing an integer stored in a map is really a map lookup, an increment, and a map store.
The first and the last operations involve a lookup by key.
Now consider what happens if one goroutine performs a lookup at the same time another goroutine modifies the map's internal state while performing a map store.
You might want to read up a bit on what an associative array is, and how it's typically implemented.
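To spell out that last point, counters[key]++ on a map behaves roughly like the following three steps (a conceptual sketch, not literal compiler output):

// Conceptually, counters[key]++ does:
func increment(counters map[int]int, key int) {
    v := counters[key] // map lookup: reads the map's internal state
    v++                // increment the local copy
    counters[key] = v  // map store: may modify the map's internal state
}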
Aren't we reading/writing values only one goroutine can access?
You already got a great answer from @kostix on that matter: the internals of the map are modified when you add elements to it, so it's not accurate to think that goroutine 0 will modify values only in counters[0] through counters[4].
But that's not all.
There's yet another data race issue in your code that's a bit more subtle and might be very difficult to catch even in tests.
To explore it, let's get rid of the "map internals" issue that @kostix mentioned, by imagining that your code is almost exactly the same, but with one tiny change: instead of using a map[int]int, imagine that you're using a []int, initialized to have at least length 56. Something like this:
// THERE'S ANOTHER RACE CONDITION HERE.

// var counters = map[int]int{}
var counters = make([]int, 56)

for i := 0; i < 5; i++ {
    // go func(counters map[int]int, th int) {
    go func(counters []int, th int) {
        for j := 0; j < 5; j++ {
            counters[th*10+j]++
        }
    }(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
This is nearly equivalent, but gets rid of the "map internals" issue. The goal is to shift the focus away from "map internals" to show you the second issue.
There's still a race condition there. By the way, it's also similar to a race condition that exists in the first attempted solution in another answer you got, that uses a sync.Mutex but in a way that is still wrong.
The problem here is that there's no happens before relationship between the operations that change the counters and the operation that reads from it.
The fmt.Scanln() doesn't help: it lets you introduce an arbitrary time delay between the code right before it (i.e., the for loop that launches the goroutines) and the code right after it (i.e., the fmt.Println()), so you might think "Ok, I'm just gonna wait 'a reasonably long amount of time' before pressing Enter", but that doesn't eliminate the race condition.
The race condition here arises from the fact that "passage of time" (i.e., you waiting to hit Enter) does not establish a happens-before relationship between the writes to counters and the reads from it.
This notion of happens-before is absolutely fundamental for avoiding data races: you can only guarantee the absence of a data race if you can guarantee the existence of a happens-before relationship between 2 operations.
Like I mentioned, "passage of time" doesn't establish a "happens before". To establish it, you could use one of many alternatives, including primitives in the sync or atomic packages, or channels, etc.
While I'd probably suggest focusing on studying channels, and then the sync package (sync.Mutex, sync.WaitGroup, etc), and maybe only after all that the atomic package, if you do want to read more about this idea of happens before from the authoritative source, here's the link: https://golang.org/ref/mem . But be warned that it's a nasty can of worms.
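As a minimal illustration of the channel option (a sketch, not part of the original answer): a receive that completes after a close establishes exactly the happens-before edge discussed above:

package main

import "fmt"

func main() {
    counters := map[int]int{}
    done := make(chan struct{})
    go func() {
        counters[42]++ // the write happens in this goroutine
        close(done)    // signal completion
    }()
    <-done // receive completes after close: the write is now visible
    fmt.Println(counters[42]) // guaranteed to print 1
}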
Hopefully these comments here help you see why it's absolutely fundamental to follow the standard patterns for concurrency in Go. Things can be way more subtle than at first sight.
And to conclude, a quote from The Go Memory Model link I shared above:
If you must read the rest of this document to understand the behavior of your program, you are being too clever.
Don't be clever.
EDIT: for completeness, here's how you could solve the problem.
There are 2 parts to the solution: (1) make sure that there's no concurrent modifications to the map; (2) make sure that there's a happens-before between all the changes to the map and the read.
For (1), you can use a sync.Mutex. Lock it before writing, unlock it after the write.
For (2), you need to ensure that the main goroutine can only get to the fmt.Println() after all the modifications are done. And remember: here, after doesn't mean "at a later point in time", but it specifically means that a happens-before relationship must be established. The 2 common patterns to solve this are to use a channel or a sync.WaitGroup. The WaitGroup solution is probably easier to reason about here, so that's what I'd use.
var mu sync.Mutex     // (A)
var wg sync.WaitGroup // (A)

var counters = map[int]int{}

wg.Add(5) // (B)
for i := 0; i < 5; i++ {
    go func(counters map[int]int, th int) {
        for j := 0; j < 5; j++ {
            mu.Lock() // (C)
            counters[th*10+j]++
            mu.Unlock() // (C)
        }
        wg.Done() // (D)
    }(counters, i)
}
wg.Wait() // (E)

fmt.Scanln()
fmt.Println("counters result", counters)
(A) You don't need to initialize either the Mutex or the WaitGroup, since their zero values are ready to use. Also, you don't need to make them pointers to anything.
(B) You .Add(5) to the WaitGroup's counter, meaning that it will have to wait for 5 .Done() signals before proceeding if you .Wait() on it. The number 5 here is because you're launching 5 goroutines, and you need to establish happens-before relationships between the changes made on all of them and the main goroutine's fmt.Println().
(C) You .Lock() and .Unlock() the Mutex around modifications to the map, to ensure that they are not done concurrently.
(D) Just before each goroutine terminates, you call wg.Done(), which decrements the WaitGroup's internal counter.
(E) Finally, you wg.Wait(). This function blocks until the wg's counter reaches 0. And here's the super important piece: the WaitGroup establishes a happens-before relationship between the calls to wg.Done() and the return of the wg.Wait() call. In other words, from a memory consistency perspective, the main goroutine is guaranteed to see all the changes performed to the map by all the goroutines!
AND FINALLY you can run that code with -race and be happy!
For you to explore further: instead of map + sync.Mutex, you could replace that with just sync.Map. But the sync.WaitGroup would still be necessary. Try to write a solution using that, it might be a nice exercise.
In addition to @kostix's answer: you have to know that multiple goroutines should not access (write/read) the same resource at the same time.
So, in your implementation you may easily end up with multiple goroutines updating (reading/writing) the same resource (your map) concurrently.
What should happen? Which value should end up in a given map key? This is what is called a race condition.
Here are some potential fixes to your code:
Using Mutex:
package main

import (
    "fmt"
    "sync"
)

func main() {
    var counters = map[int]int{}
    var mutex = &sync.Mutex{}

    for i := 0; i < 3; i++ {
        go func(counters map[int]int, th int) {
            for j := 0; j < 3; j++ {
                mutex.Lock() // Lock the access to the map
                counters[th*10+j]++
                mutex.Unlock() // Release the access
            }
        }(counters, i)
    }
    fmt.Scanln()
    fmt.Println("counters result", counters)
}
Output:
counters result map[0:1 1:1 2:1 10:1 11:1 12:1 20:1 21:1 22:1]
Using sync.Map:
package main

import (
    "fmt"
    "sync"
)

func main() {
    var counters sync.Map

    for i := 0; i < 3; i++ {
        go func(th int) {
            for j := 0; j < 3; j++ {
                // Note: Load followed by Store is not atomic as a pair;
                // it is safe here only because each key is written by
                // exactly one goroutine.
                if result, ok := counters.Load(th*10 + j); ok {
                    counters.Store(th*10+j, result.(int)+1)
                } else {
                    counters.Store(th*10+j, 1)
                }
            }
        }(i)
    }
    fmt.Scanln()
    counters.Range(func(k, v interface{}) bool {
        fmt.Println("key:", k, ", value:", v)
        return true
    })
}
Output:
key: 21 , value: 1
key: 10 , value: 1
key: 11 , value: 1
key: 0 , value: 1
key: 1 , value: 1
key: 20 , value: 1
key: 2 , value: 1
key: 22 , value: 1
key: 12 , value: 1

Why doesn't this buffered channel block in my code?

I'm learning the Go language. Can someone please explain the output here?
package main

import "fmt"

var c = make(chan int, 1)

func f() {
    c <- 1
    fmt.Println("In f()")
}

func main() {
    go f()
    c <- 2
    fmt.Println(<-c)
    fmt.Println(<-c)
}
Output:
In f()
2
1
Process finished with exit code 0
Why did "In f()" occur before "2"? If "In f()" is printed before "2", the buffered channel should have blocked. But this didn't happen. Why?
The other outputs are reasonable.
The order of events that lets this run to completion is:
Trigger goroutine.
Write 2 to the channel. Capacity of the channel is now exhausted.
Read from the channel and output result.
Write 1 to the channel.
Read from the channel and output result.
The order of events that causes this to deadlock is:
Trigger goroutine.
Write 1 to the channel. Capacity of the channel is now exhausted.
Write 2 to the channel. Since the channel's buffer is full, this blocks.
The output you provide seems to indicate that the goroutine finishes first and the program doesn't deadlock, which contradicts the above two scenarios as an explanation. Here's what happens though:
Trigger goroutine.
Write 2 to the channel.
Read 2 from the channel.
Write 1 to the channel.
Output In f().
Output the 2 received from the channel.
Read 1 from the channel.
Output the 1 received from the channel.
Keep in mind that you don't have any guarantees concerning the scheduling of goroutines unless you programmatically enforce them. When you start a goroutine, it is undefined when the first code of that goroutine is actually executed and how far the starting code progresses until then. Note that since your code relies on a certain order of events, it is broken by that definition, just to say that explicitly.
Also, at any point in your program, the scheduler can decide to switch between different goroutines. Even the single line fmt.Println(<-c) consists of multiple steps, and in between each step a switch can occur.
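To see the blocking scenario deterministically instead of waiting for an unlucky schedule, here is a sketch (the ready channel is an addition, not part of the original code) that forces f() to send first; run it and the runtime reports a deadlock:

package main

import "fmt"

var c = make(chan int, 1)
var ready = make(chan struct{}) // added signal so f() always sends first

func f() {
    c <- 1 // fills the single-slot buffer
    fmt.Println("In f()")
    close(ready)
}

func main() {
    go f()
    <-ready // guarantee f's send happened first
    c <- 2  // buffer is full and nobody reads: deadlock
    fmt.Println(<-c)
    fmt.Println(<-c)
}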
To reproduce the blocking you have to run the code many times; the easiest way to do that is with a test. You didn't see a block only out of luck, but it is not guaranteed:
// Save as main_test.go (test files must end in _test.go).
package main

import (
    "fmt"
    "testing"
)

var c = make(chan int, 1)

func f() {
    c <- 1
    fmt.Println("In f()")
}

func TestF(t *testing.T) {
    go f()
    c <- 2
    fmt.Println(<-c)
    fmt.Println(<-c)
}
and run it with the command:
go test -race -count=1000 -run=TestF -timeout=4s
where -count is the number of test runs. This reproduces the blocking for me. Provide a timeout so you don't wait out the default of 10 minutes.

Does Go's ++ operator need a mutex?

Does Go's ++ operator need a mutex?
It seems that when not using a mutex I lose some data, but logically ++ just adds 1 to the current value, so even if the order is incorrect, shouldn't a total of 1000 increments still happen?
Example:
package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    i := 0
    for r := 0; r < 1000; r++ {
        wg.Add(1)
        go func() {
            i++
            fmt.Println(i)
            wg.Done()
        }()
    }
    wg.Wait()
    fmt.Printf("%d Done", i)
}
To "just add 1 to the current value" the computer needs to read the current value, add 1, and write the new value back. Clearly ordering does matter; the standard example is:
Thread A    Thread B
Read: 5
            Read: 5
+1 = 6
            +1 = 6
Write: 6
            Write: 6
The value started at 5, two threads of execution each added one, and the result is 6 (when it should be 7), because B's read occurred before A's write.
But there's a more important misconception at play here: many people think that in the case of a race, the code will either read the old value, or it will read the new value. This is not guaranteed. It might be what happens most of the time. It might be what happens all the time on your computer, with the current version of the compiler, etc. But actually it's possible for code that accesses data in an unsafe/racy manner to produce any result, even complete garbage. There's no guarantee that the value you read from a variable corresponds to any value it has ever had, if you cause a race.
just add +1 value to the current value
No, it's not "just add". It's
Read current value
Compute new value (based on what was read) and write it
See how this can break with multiple concurrent actors?
If you want atomic increments, check out sync/atomic. Examples: https://gobyexample.com/atomic-counters
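A sketch of that approach applied to the question's program (assuming an int64 counter so it works with atomic.AddInt64); the final value is always 1000:

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var wg sync.WaitGroup
    var i int64
    for r := 0; r < 1000; r++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            atomic.AddInt64(&i, 1) // one indivisible read-modify-write
        }()
    }
    wg.Wait()
    fmt.Printf("%d Done\n", atomic.LoadInt64(&i)) // always 1000
}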

The variable in Goroutine not changed as expected

The code is as simple as below:
package main

import (
    "fmt"
    // "sync"
    "time"
)

var count = uint64(0)

//var l sync.Mutex

func add() {
    for {
        // l.Lock()
        // fmt.Println("Start ++")
        count++
        // l.Unlock()
    }
}

func main() {
    go add()
    time.Sleep(1 * time.Second)
    fmt.Println("Count =", count)
}
Cases:
1. Running the code without changes, you will get "Count = 0". Not expected??
2. Only uncomment line 16, fmt.Println("Start ++"): you will get output with lots of "Start ++" and some value for Count, like "Count = 11111". Expected??
3. Only uncomment line 11 (var l sync.Mutex), line 15 (l.Lock()) and line 18 (l.Unlock()), keeping line 16 commented: you will get output like "Count = 111111111". Expected.
So... is something wrong with my usage of the shared variable? My questions:
1. Why does case 1 print 0 for Count?
2. If case 1 is expected, why does case 2 happen?
Env:
1. go version go1.8 linux/amd64
2. 3.10.0-123.el7.x86_64
3. CentOS Linux release 7.0.1406 (Core)
You have a data race on count. The results are undefined.
package main

import (
    "fmt"
    // "sync"
    "time"
)

var count = uint64(0)

//var l sync.Mutex

func add() {
    for {
        // l.Lock()
        // fmt.Println("Start ++")
        count++
        // l.Unlock()
    }
}

func main() {
    go add()
    time.Sleep(1 * time.Second)
    fmt.Println("Count =", count)
}
Output:
$ go run -race racer.go
==================
WARNING: DATA RACE
Read at 0x0000005995b8 by main goroutine:
runtime.convT2E64()
/home/peter/go/src/runtime/iface.go:255 +0x0
main.main()
/home/peter/gopath/src/so/racer.go:25 +0xb9
Previous write at 0x0000005995b8 by goroutine 6:
main.add()
/home/peter/gopath/src/so/racer.go:17 +0x5c
Goroutine 6 (running) created at:
main.main()
/home/peter/gopath/src/so/racer.go:23 +0x46
==================
Count = 42104672
Found 1 data race(s)
$
References:
Benign data races: what could possibly go wrong?
Without any synchronisation you have no guarantees at all.
There could be multiple reasons why you see 'Count = 0' in your first case:
You have a multiprocessor or multicore system, and one unit (CPU or core) is happily churning away at the for loop while the other sleeps for one second and then prints the line you are seeing. It would be completely legal for the compiler to generate machine code that loads the value into some register and only ever increments that register in the for loop; the memory location need only be updated when the function is done with the variable, which, in the case of an infinite for loop, is never. By omitting any synchronisation you, the programmer, told the compiler that there is no contention over that variable.
In your mutex version, the synchronisation primitives tell the compiler that some other thread might take the mutex, so it needs to write the value back from the register to the memory location before unlocking. At least one can think about it like that.
What really happens is that the unlock and a later lock operation introduce a happens-before relation between the two goroutines, and this guarantees that we will see all writes to variables from before the unlock in one goroutine after the lock operation in the other goroutine, as described in the Go memory model's section on locks, however this is implemented.
Another possibility: the Go runtime scheduler doesn't run the for loop at all until the sleep in the main goroutine is done. (This isn't likely, but, if I recall correctly, there is no guarantee that it doesn't happen.) Sadly there is not much official documentation about how the scheduler works in Go, but it can only schedule a goroutine at certain points; it is not really preemptive. The consequences of this are severe. For example, you could make your program run forever in some versions of Go by firing up as many goroutines as you had cores, each doing an endless for loop that only increments a variable. There would be no core left for the main goroutine (which could end the program), and the scheduler can't preempt a goroutine stuck in an endless for loop doing only simple things, like incrementing a variable. I don't know if that has changed by now.
As others pointed out, that is a data race; google it and read up on it.
The difference between your versions where only line 16 is commented/uncommented is likely only due to run time, as printing to a terminal can be pretty slow.
For a correct program, you need to additionally lock the mutex after the sleep in your main function and before the fmt.Println, and unlock it afterwards. But there can't be a deterministic expectation for the output, as the result will vary with machine/OS/...
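Putting that last point into code, a sketch of the corrected mutex version (locking around the final read as well as the increments):

package main

import (
    "fmt"
    "sync"
    "time"
)

var count = uint64(0)
var l sync.Mutex

func add() {
    for {
        l.Lock()
        count++
        l.Unlock()
    }
}

func main() {
    go add()
    time.Sleep(1 * time.Second)
    l.Lock() // establishes happens-before with add's unlocks
    c := count
    l.Unlock()
    fmt.Println("Count =", c) // some large, nondeterministic value
}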
