does go ++ operator need mutex?

Does go ++ Operator need mutex?
It seems that when I don't use a mutex I lose some data. But logically, ++ just adds 1 to the current value, so even if the order is wrong, a total of 1000 increments should still happen, no?
Example:
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
i := 0
for r := 0; r < 1000; r++ {
wg.Add(1)
go func() {
i++
fmt.Println(i)
wg.Done()
}()
}
wg.Wait()
fmt.Printf("%d Done", i)
}

To "just add 1 to the current value" the computer needs to read the current value, add 1, and write the new value back. Clearly ordering does matter; the standard example is:
Thread A        Thread B
Read: 5
                Read: 5
+1 = 6
                +1 = 6
Write: 6
                Write: 6
The value started at 5, two threads of execution each added one, and the result is 6 (when it should be 7), because B's read occurred before A's write.
But there's a more important misconception at play here: many people think that in the case of a race, the code will either read the old value, or it will read the new value. This is not guaranteed. It might be what happens most of the time. It might be what happens all the time on your computer, with the current version of the compiler, etc. But actually it's possible for code that accesses data in an unsafe/racy manner to produce any result, even complete garbage. There's no guarantee that the value you read from a variable corresponds to any value it has ever had, if you cause a race.
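One way to close that window is to hold a lock around the increment. The following is only a minimal sketch, assuming the same 1000-goroutine loop as in the question; the function name countWithMutex is made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex (hypothetical name) starts n goroutines that each
// increment a shared counter once while holding a mutex.
func countWithMutex(n int) int {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
		i  int
	)
	for r := 0; r < n; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			i++ // read, add, and write back as one unit under the lock
			mu.Unlock()
		}()
	}
	wg.Wait() // every increment happens before Wait returns
	return i
}

func main() {
	fmt.Printf("%d Done\n", countWithMutex(1000))
}
```

With the lock in place no increment can be lost, so the final count should be 1000 on every run.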

just add +1 value to the current value
No, it's not "just add". It's
Read current value
Compute new value (based on what was read) and write it
See how this can break with multiple concurrent actors?
If you want atomic increments, check out sync/atomic. Examples: https://gobyexample.com/atomic-counters
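As a rough sketch of what that could look like for the counter in the question (the function name countAtomically is an assumption, not something from the original code):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomically (hypothetical name) increments a shared int64 from n
// goroutines using atomic.AddInt64 instead of a mutex.
func countAtomically(n int) int64 {
	var (
		wg sync.WaitGroup
		i  int64
	)
	for r := 0; r < n; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&i, 1) // indivisible read-modify-write
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&i)
}

func main() {
	fmt.Printf("%d Done\n", countAtomically(1000))
}
```

The WaitGroup is still needed: the atomic operation protects the increment itself, while wg.Wait() makes sure main reads the counter only after all goroutines are done.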

Related

Doesn't go routine and the channels work in order of call?

Don't goroutines and channels run in the order in which they were called?
And do goroutines share the values of variables between scopes?
main.go
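package main
import (
	"fmt"
	"time"
)
var dataSendChannel = make(chan int)
func main() {
	a(dataSendChannel)
	time.Sleep(time.Second * 10)
}
// a starts one reader goroutine per value, then sends that value.
func a(c chan<- int) {
	for i := 0; i < 1000; i++ {
		go b(dataSendChannel)
		c <- i
	}
}
// b receives a single value and prints it.
func b(c <-chan int) {
	val := <-c
	fmt.Println(val)
}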
output
> go run main.go
0
1
54
3
61
5
6
7
8
9

Channels are ordered. Goroutines are not. Goroutines may run, or stall, more or less at random, whenever they are logically allowed to run. They must stop and wait whenever you force them to do so, e.g., by attempting to write on a full channel, or using a mutex.Lock() call on an already-locked mutex, or any of those sorts of things.
Your dataSendChannel is unbuffered, so an attempt to write to it will pause until some goroutine is actively attempting to read from it. Function a spins off one goroutine that will attempt one read (go b(...)), then writes and therefore waits for at least one reader to be reading. Function b immediately begins reading, waiting for data. This unblocks function a, which can now write some integer value. Function a can now spin off another instance of b, which begins reading; this may happen before, during, or after the b that got a value begins calling fmt.Println. This second instance of b must now wait for someone—which in this case is always function a, running the loop—to send another value, but a does that as quickly as it can. The second instance of b can now begin calling fmt.Println, but it might, mostly-randomly, not get a chance to do that yet. The first instance of b might already be in fmt.Println, or maybe it isn't yet, and the second one might run first—or maybe both wait around for a while and a third instance of b spins up, reads some value from the channel, and so on.
There's no guarantee which instance of b actually gets into fmt.Println when, so the values you see printed will come out in some semi-random order. If you want the various b instances to sequence themselves, they will need to do that somehow.
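One possible way to get ordered output, sketched here under the assumption that a single receiver is acceptable, is to stop spawning one goroutine per value and let one loop drain the channel. Because the channel itself is ordered, a single receiver sees the values in send order. The function name receiveInOrder is hypothetical.

```go
package main

import "fmt"

// receiveInOrder (hypothetical name) sends 0..n-1 on an unbuffered channel
// and reads them back with a single receiver, so arrival order is send order.
func receiveInOrder(n int) []int {
	c := make(chan int)
	go func() {
		defer close(c) // lets the range loop below terminate
		for i := 0; i < n; i++ {
			c <- i
		}
	}()

	got := make([]int, 0, n)
	for val := range c {
		got = append(got, val)
	}
	return got
}

func main() {
	for _, val := range receiveInOrder(10) {
		fmt.Println(val)
	}
}
```

Closing the channel also removes the need for the time.Sleep in main, since the range loop ends exactly when the sender is finished.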

What happens when reading or writing concurrently without a mutex

In Go, a sync.Mutex or chan is used to prevent concurrent access of shared objects. However, in some cases I am just interested in the "latest" value of a variable or field of an object.
Or I like to write a value and do not care if another go-routine overwrites it later or has just overwritten it before.
Update: TLDR; Just don't do this. It is not safe. Read the answers, comments, and linked documents!
Update 2021: The Go memory model is going to be specified more thoroughly and there are three great articles by Russ Cox that will teach you more about the surprising effects of unsynchronized memory access. These articles summarize a lot of the below discussions and learnings.
Here are two variants, good and bad, of an example program; both seem to produce "correct" output using the current Go runtime:
package main
import (
"flag"
"fmt"
"math/rand"
"time"
)
var bogus = flag.Bool("bogus", false, "use bogus code")
func pause() {
time.Sleep(time.Duration(rand.Uint32()%100) * time.Millisecond)
}
func bad() {
stop := time.After(100 * time.Millisecond)
var name string
// start some producers doing concurrent writes (DANGER!)
for i := 0; i < 10; i++ {
go func(i int) {
pause()
name = fmt.Sprintf("name = %d", i)
}(i)
}
// start consumer that shows the current value every 10ms
go func() {
tick := time.Tick(10 * time.Millisecond)
for {
select {
case <-stop:
return
case <-tick:
fmt.Println("read:", name)
}
}
}()
<-stop
}
func good() {
stop := time.After(100 * time.Millisecond)
names := make(chan string, 10)
// start some producers concurrently writing to a channel (GOOD!)
for i := 0; i < 10; i++ {
go func(i int) {
pause()
names <- fmt.Sprintf("name = %d", i)
}(i)
}
// start consumer that shows the current value every 10ms
go func() {
tick := time.Tick(10 * time.Millisecond)
var name string
for {
select {
case name = <-names:
case <-stop:
return
case <-tick:
fmt.Println("read:", name)
}
}
}()
<-stop
}
func main() {
flag.Parse()
if *bogus {
bad()
} else {
good()
}
}
The expected output is as follows:
...
read: name = 3
read: name = 3
read: name = 5
read: name = 4
...
Any combination of read: and read: name = [0-9] lines is correct output for this program. Receiving any other string as output would be an error.
When running this program with go run -race bogus.go, the race detector reports nothing.
However, go run -race bogus.go -bogus warns of the concurrent reads and writes.
For map types and when appending to slices I always need a mutex or a similar method of protection to avoid segfaults or unexpected behavior. However, reading and writing literals (atomic values) to variables or field values seems to be safe.
Question: Which Go data types can I safely read and safely write concurrently without a mutex and without producing segfaults and without reading garbage from memory?
Please explain why something is safe or unsafe in Go in your answer.
Update: I rewrote the example to better reflect the original code, where I had the concurrent writes issue. The important learnings are already in the comments. I will accept an answer that summarizes these learnings with enough detail (esp. on the Go runtime).

However, in some cases I am just interested in the latest value of a variable or field of an object.
Here is the fundamental problem: What does the word "latest" mean?
Suppose that, mathematically speaking, we have a sequence of values Xi, with 0 <= i < N. Then obviously Xj is "later than" Xi if j > i. That's a nice simple definition of "latest" and is probably the one you want.
But when two separate CPUs within a single machine—including two goroutines in a Go program—are working at the same time, time itself loses meaning. We cannot say whether i < j, i == j, or i > j. So there is no correct definition for the word latest.
To solve this kind of problem, modern CPU hardware, and Go as a programming language, give us certain synchronization primitives. If CPUs A and B execute memory fence instructions, or synchronization instructions, or use whatever other hardware provisions exist, the CPUs (and/or some external hardware) will insert whatever is required for the notion of "time" to regain its meaning. That is, if the CPU uses barrier instructions, we can say that a memory load or store that was executed before the barrier is a "before" and a memory load or store that is executed after the barrier is an "after".
(The actual implementation, in some modern hardware, consists of load and store buffers that can rearrange the order in which loads and stores go to memory. The barrier instruction either synchronizes the buffers, or places an actual barrier in them, so that loads and stores cannot move across the barrier. This particular concrete implementation gives an easy way to think about the problem, but isn't complete: you should think of time as simply not existing outside the hardware-provided synchronization, i.e., all loads from, and stores to, some location are happening simultaneously, rather than in some sequential order, except for these barriers.)
In any case, Go's sync package gives you a simple high level access method to these kinds of barriers. Compiled code that executes before a mutex Lock call really does complete before the lock function returns, and the code that executes after the call really does not start until after the lock function returns.
Go's channels provide the same kinds of before/after time guarantees.
Go's sync/atomic package provides much lower level guarantees. In general you should avoid this in favor of the higher level channel or sync.Mutex style guarantees. (Edit to add note: You could use sync/atomic's Pointer operations here, but not with the string type directly, as Go strings are actually implemented as a header containing two separate values: a pointer, and a length. You could solve this with another layer of indirection, by updating a pointer that points to the string object. But before you even consider doing that, you should benchmark the use of the language's preferred methods and verify that these are a problem, because code that works at the sync/atomic level is hard to write and hard to debug.)
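As a rough sketch of the higher-level approach recommended above, assuming a single string shared between several producers and a consumer, a mutex-guarded holder might look like this. The type latestName and its Set and Get methods are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// latestName (hypothetical type) guards a string with a mutex so that
// readers always see a value that some writer actually stored.
type latestName struct {
	mu   sync.RWMutex
	name string
}

// Set stores a new value under the write lock.
func (l *latestName) Set(s string) {
	l.mu.Lock()
	l.name = s
	l.mu.Unlock()
}

// Get returns the most recently stored value under the read lock.
func (l *latestName) Get() string {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.name
}

func main() {
	var (
		l  latestName
		wg sync.WaitGroup
	)
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			l.Set(fmt.Sprintf("name = %d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println("read:", l.Get()) // one of "name = 0" .. "name = 9"
}
```

Here "latest" has a definite meaning: the value stored by whichever Set call acquired the lock last before the Get.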

Which Go data types can I safely read and safely write concurrently without a mutex and without producing segfaults and without reading garbage from memory?
None.
It really is that simple: you cannot, under any circumstances whatsoever, read and write concurrently to anything in Go without synchronization.
(Btw: Your "correct" program is not correct, it is racy and even if you get rid of the race condition it would not deterministically produce the output.)

Why can't you use channels?
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup // wait group to close channel
var buffer int = 1 // buffer of the channel
// channel to get the share data
cName := make(chan string, buffer)
for i := 0; i < 10; i++ {
wg.Add(1) // add to wait group
go func(i int) {
cName <- fmt.Sprintf("name = %d", i)
wg.Done() // decrease wait group.
}(i)
}
go func() {
wg.Wait() // wait of wait group to be 0
close(cName) // close the channel
}()
// process all the data
for n := range cName {
println("read:", n)
}
}
The above code returns the following output
read: name = 0
read: name = 5
read: name = 1
read: name = 2
read: name = 3
read: name = 4
read: name = 7
read: name = 6
read: name = 8
read: name = 9
https://play.golang.org/p/R4n9ssPMOeS
Article about channels

unclear on reasons why there is a race condition

The question concerns the following code:
package main
import "fmt"
func main() {
var counters = map[int]int{}
for i := 0; i < 5; i++ {
go func(counters map[int]int, th int) {
for j := 0; j < 5; j++ {
counters[th*10+j]++
}
}(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
}
Here is the output I get when I run this code with go run -race race.go
$ go run -race race.go
==================
WARNING: DATA RACE
Read at 0x00c000092150 by goroutine 8:
runtime.mapaccess1_fast64()
/usr/lib/go-1.13/src/runtime/map_fast64.go:12 +0x0
main.main.func1()
/tmp/race.go:10 +0x6b
Previous write at 0x00c000092150 by goroutine 7:
runtime.mapassign_fast64()
/usr/lib/go-1.13/src/runtime/map_fast64.go:92 +0x0
main.main.func1()
/tmp/race.go:10 +0xaf
Goroutine 8 (running) created at:
main.main()
/tmp/race.go:8 +0x67
Goroutine 7 (finished) created at:
main.main()
/tmp/race.go:8 +0x67
==================
==================
WARNING: DATA RACE
Read at 0x00c0000aa188 by main goroutine:
reflect.typedmemmove()
/usr/lib/go-1.13/src/runtime/mbarrier.go:177 +0x0
reflect.copyVal()
/usr/lib/go-1.13/src/reflect/value.go:1297 +0x7b
reflect.(*MapIter).Value()
/usr/lib/go-1.13/src/reflect/value.go:1251 +0x15e
internal/fmtsort.Sort()
/usr/lib/go-1.13/src/internal/fmtsort/sort.go:61 +0x259
fmt.(*pp).printValue()
/usr/lib/go-1.13/src/fmt/print.go:773 +0x146f
fmt.(*pp).printArg()
/usr/lib/go-1.13/src/fmt/print.go:716 +0x2ee
fmt.(*pp).doPrintln()
/usr/lib/go-1.13/src/fmt/print.go:1173 +0xad
fmt.Fprintln()
/usr/lib/go-1.13/src/fmt/print.go:264 +0x65
main.main()
/usr/lib/go-1.13/src/fmt/print.go:274 +0x13c
Previous write at 0x00c0000aa188 by goroutine 10:
main.main.func1()
/tmp/race.go:10 +0xc4
Goroutine 10 (finished) created at:
main.main()
/tmp/race.go:8 +0x67
==================
counters result map[0:1 1:1 2:1 3:1 4:1 10:1 11:1 12:1 13:1 14:1 20:1 21:1 22:1 23:1 24:1 30:1 31:1 32:1 33:1 34:1 40:1 41:1 42:1 43:1 44:1]
Found 2 data race(s)
exit status 66
Here is what I can't understand: why is there a race condition at all? Aren't we reading/writing values only one goroutine can access? For example, routine 0 will modify values only in counters[0] through counters[4], routine 1 will modify values only in counters[10] through counters[14], routine 2 will only modify values in counters[20] through counters[24], and so on. I'm not seeing a race condition here. Feels like I'm missing something. Will someone be able to shed some light on this?
Just an FYI I'm new to go. If you could dumb down the explanation (if it is possible) I would appreciate it.

That would be true for an array (or a slice), but a map is a complicated data structure which has, among others, the following properties:
It's free to relocate the elements stored in it in memory at any time it sees fit.
A map is initially empty, and placing an element in it (what appears as assignment in your case) involves a lot of operations on the map's internals.
Additionally, what happens in a case like yours, incrementing an integer stored in a map, is really a map lookup, an increment, and a map store.
The first and the last operations involve lookup by key.
Now consider what happens if one goroutine performs lookup at the same time another goroutine modifies the map's internal state when performing map store.
You might want to read up a bit on what is an associative array, and how it's typically implemented.
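The lookup, increment, and store decomposition described above can be spelled out in code. This is only an illustration of the logical steps, not of how the runtime implements them; the helper name incrementExpanded is hypothetical.

```go
package main

import "fmt"

// incrementExpanded (hypothetical name) spells out the logical steps
// behind counters[key]++.
func incrementExpanded(counters map[int]int, key int) {
	v := counters[key] // map lookup (yields 0 if the key is absent)
	v++                // increment the local copy
	counters[key] = v  // map store (may insert the key and grow the map)
}

func main() {
	a := map[int]int{}
	b := map[int]int{}
	for k := 0; k < 3; k++ {
		a[k]++
		incrementExpanded(b, k)
	}
	fmt.Println(a, b)
}
```

Both the lookup and the store walk the map's shared internal structure, which is why goroutines touching different keys can still collide.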

Aren't we reading/writing values only one go routine can access?
You already got a great answer from @kostix on that matter: the internals of the map are modified when you add elements to it, so it's not accurate to think that routine 0 will modify values only in counters[0] through counters[4].
But that's not all.
There's yet another data race issue in your code that's a bit more subtle and might be very difficult to catch even in tests.
To explore it, let's get rid of the "map internals" issue that @kostix mentioned, by imagining that your code is almost exactly the same, but with one tiny change: instead of using a map[int]int, imagine that you're using a []int, initialized to have at least length 56. Something like this:
// THERE'S ANOTHER RACE CONDITION HERE.
// var counters = map[int]int{}
var counters = make([]int, 56)
for i := 0; i < 5; i++ {
// go func(counters map[int]int, th int) {
go func(counters []int, th int) {
for j := 0; j < 5; j++ {
counters[th*10+j]++
}
}(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
This is nearly equivalent, but gets rid of the "map internals" issue. The goal is to shift the focus away from "map internals" to show you the second issue.
There's still a race condition there. By the way, it's also similar to a race condition that exists in the first attempted solution in another answer you got, that uses a sync.Mutex but in a way that is still wrong.
The problem here is that there's no happens before relationship between the operations that change the counters and the operation that reads from it.
The fmt.Scanln() doesn't help: even though it allows you to introduce an arbitrary time delay between the code right before it (i.e., when the for loop launches the goroutines) and the code right after it (i.e., the fmt.Println()) — so that you could think "Ok, I'm just gonna wait 'a reasonably long amount of time' before pressing Enter", that doesn't eliminate the race condition.
The race condition here arises from the fact that "passage of time" (i.e., you waiting to hit Enter) does not establish a happens-before relationship between the writes to counters and the reads from it.
This notion of happens-before is absolutely fundamental for avoiding data races: you can only guarantee the absence of a data race if you can guarantee the existence of a happens-before relationship between 2 operations.
Like I mentioned, "passage of time" doesn't establish a "happens before". To establish it, you could use one of many alternatives, including primitives in the sync or atomic packages, or channels, etc.
While I'd probably suggest focusing on studying channels, and then the sync package (sync.Mutex, sync.WaitGroup, etc), and maybe only after all that the atomic package, if you do want to read more about this idea of happens before from the authoritative source, here's the link: https://golang.org/ref/mem . But be warned that it's a nasty can of worms.
Hopefully these comments here help you see why it's absolutely fundamental to follow the standard patterns for concurrency in Go. Things can be way more subtle than at first sight.
And to conclude, a quote from The Go Memory Model link I shared above:
If you must read the rest of this document to understand the behavior of your program, you are being too clever.
Don't be clever.
EDIT: for completion, here's how you could solve the problem.
There are 2 parts to the solution: (1) make sure that there are no concurrent modifications to the map; (2) make sure that there's a happens-before between all the changes to the map and the read.
For (1), you can use a sync.Mutex. Lock it before writing, unlock it after the write.
For (2), you need to ensure that the main goroutine can only get to the fmt.Println() after all the modifications are done. And remember: here, after doesn't mean "at a later point in time", but it specifically means that a happens-before relationship must be established. The 2 common patterns to solve this are to use a channel or a sync.WaitGroup. The WaitGroup solution is probably easier to reason about here, so that's what I'd use.
var mu sync.Mutex // (A)
var wg sync.WaitGroup // (A)
var counters = map[int]int{}
wg.Add(5) // (B)
for i := 0; i < 5; i++ {
go func(counters map[int]int, th int) {
for j := 0; j < 5; j++ {
mu.Lock() // (C)
counters[th*10+j]++
mu.Unlock() // (C)
}
wg.Done() // (D)
}(counters, i)
}
wg.Wait() // (E)
fmt.Scanln()
fmt.Println("counters result", counters)
(A) You don't need to initialize either the Mutex or the WaitGroup, since their zero values are ready to use. Also, you don't need to make them pointers to anything.
(B) You .Add(5) to the WaitGroup's counter, meaning that it will have to wait for 5 .Done() signals before proceeding if you .Wait() on it. The number 5 here is because you're launching 5 goroutines, and you need to establish happens-before relationships between the changes made on all of them and the main goroutine's fmt.Println().
(C) You .Lock() and .Unlock() the Mutex around modifications to the map, to ensure that they are not done concurrently.
(D) Just before each goroutine terminates, you call wg.Done(), which decrements the WaitGroup's internal counter.
(E) Finally, you wg.Wait(). This function blocks until the wg's counter reaches 0. And here's the super important piece: the WaitGroup establishes a happens-before relationship between the calls to wg.Done() and the return of the wg.Wait() call. In other words, from a memory consistency perspective, the main goroutine is guaranteed to see all the changes performed to the map by all the goroutines!
AND FINALLY you can run that code with -race and be happy!
For you to explore further: instead of map + sync.Mutex, you could replace that with just sync.Map. But the sync.WaitGroup would still be necessary. Try to write a solution using that, it might be a nice exercise.
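The channel alternative mentioned in part (2) above could be sketched as follows. This is one possible design, not the only one: the workers never touch the map and only send keys, so the receiving goroutine is the map's sole user and no mutex is needed. The function name countViaChannel and its parameters are assumptions for illustration.

```go
package main

import "fmt"

// countViaChannel (hypothetical name): workers send keys on a channel and
// only the calling goroutine touches the map, so no mutex is needed.
func countViaChannel(workers, perWorker int) map[int]int {
	keys := make(chan int)
	for i := 0; i < workers; i++ {
		go func(th int) {
			for j := 0; j < perWorker; j++ {
				keys <- th*10 + j
			}
		}(i)
	}

	counters := map[int]int{}
	// Every send is synchronized with its receive, which provides the
	// happens-before edge between the workers and this goroutine.
	for n := 0; n < workers*perWorker; n++ {
		k := <-keys
		counters[k]++
	}
	return counters
}

func main() {
	fmt.Println("counters result", countViaChannel(5, 5))
}
```

The receive loop knows the exact number of values to expect, so it doubles as the "wait until everything is done" step that the WaitGroup provides in the version above.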

In addition to @kostix's answer: you have to know that multiple goroutines should not access (read or write) the same resource at the same time without synchronization.
In your implementation, you can easily end up with multiple goroutines concurrently updating (reading and writing) the same resource, which is your map.
What should happen? Which value should end up under a given map key? This is what is called a race condition.
Here are some potential fixes for your code:
Using Mutex:
package main
import (
"fmt"
"sync"
)
func main() {
var counters = map[int]int{}
var mutex = &sync.Mutex{}
for i := 0; i < 3; i++ {
go func(counters map[int]int, th int) {
for j := 0; j < 3; j++ {
mutex.Lock() // Lock the access to the map
counters[th*10+j]++
mutex.Unlock() // Release the access
}
}(counters, i)
}
fmt.Scanln()
fmt.Println("counters result", counters)
}
Output:
counters result map[0:1 1:1 2:1 10:1 11:1 12:1 20:1 21:1 22:1]
Using sync.Map:
package main
import (
"fmt"
"sync"
)
func main() {
var counters sync.Map
for i := 0; i < 3; i++ {
go func(th int) {
for j := 0; j < 3; j++ {
if result, ok := counters.Load(th*10 + j); ok {
value := result.(int) + 1 // increment exactly once
counters.Store(th*10+j, value)
} else {
counters.Store(th*10+j, 1)
}
}
}(i)
}
fmt.Scanln()
counters.Range(func(k, v interface{}) bool {
fmt.Println("key:", k, ", value:", v)
return true
})
}
Output:
key: 21 , value: 1
key: 10 , value: 1
key: 11 , value: 1
key: 0 , value: 1
key: 1 , value: 1
key: 20 , value: 1
key: 2 , value: 1
key: 22 , value: 1
key: 12 , value: 1

Why race condition with goroutine won't happen some time?

I'm reading go-in-action. This example is from chapter6/listing09.go.
// This sample program demonstrates how to create race
// conditions in our programs. We don't want to do this.
package main
import (
"fmt"
"runtime"
"sync"
)
var (
// counter is a variable incremented by all goroutines.
counter int
// wg is used to wait for the program to finish.
wg sync.WaitGroup
)
// main is the entry point for all Go programs.
func main() {
// Add a count of two, one for each goroutine.
wg.Add(2)
// Create two goroutines.
go incCounter(1)
go incCounter(2)
// Wait for the goroutines to finish.
wg.Wait()
fmt.Println("Final Counter:", counter)
}
// incCounter increments the package level counter variable.
func incCounter(id int) {
// Schedule the call to Done to tell main we are done.
defer wg.Done()
for count := 0; count < 2; count++ {
// Capture the value of Counter.
value := counter
// Yield the thread and be placed back in queue.
runtime.Gosched()
// Increment our local value of Counter.
value++
// Store the value back into Counter.
counter = value
}
}
If you run this code on play.golang.org, it prints 2, the same as in the book.
But my Mac prints 4 most of the time, sometimes 2, and sometimes even 3.
$ go run listing09.go
Final Counter: 2
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 2
$ go run listing09.go
Final Counter: 4
$ go run listing09.go
Final Counter: 2
$ go run listing09.go
Final Counter: 3
sysinfo
go version go1.8.1 darwin/amd64
macOS sierra
Macbook Pro
Explanation from the book(p140)
Each goroutine overwrites the work of the other. This happens when the goroutine swap is taking place. Each goroutine makes its own copy of the counter variable and then is swapped out for the other goroutine. When the goroutine is given time to execute again, the value of the counter variable has changed, but the goroutine doesn’t update its copy. Instead it continues to increment the copy it has and set the value back to the counter variable, replacing the work the other goroutine performed.
According to this explanation, this code should always print 2.
Why did I get 4 and 3? Is it because the race condition didn't happen?
Why does the Go playground always give 2?
Update:
After I set runtime.GOMAXPROCS(1), it starts to print 2 most of the time, sometimes 3, and never 4.
I guess play.golang.org is configured to have one logical processor.
The right result, without a race condition, is 4. One logical processor means one thread. By default, Go has as many logical processors as there are physical cores. So:
Why does one thread (one logical processor) lead to the race condition, while multiple threads print the right answer?
Can we say the explanation from the book is wrong, since we also get 3 and 4?
How does it get 3? 4 is correct.

Race conditions are, by definition, nondeterministic. This means that while you may get a particular answer most of the time, it will not always be so.
By running racy code on multiple cores you greatly increase the number of possible interleavings, hence you get a broader selection of results.
See this post or this Wikipedia article for more information on race conditions.
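To get 4 on every run regardless of how many cores are used, the read, yield, and write steps have to be made one indivisible unit. The following is a minimal sketch of the book's listing with a mutex added; the function name runCounter and the use of a closure instead of package-level variables are assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runCounter (hypothetical name) repeats the book's experiment, but holds a
// mutex across the whole read-yield-write sequence.
func runCounter() int {
	var (
		counter int
		mu      sync.Mutex
		wg      sync.WaitGroup
	)
	incCounter := func() {
		defer wg.Done()
		for count := 0; count < 2; count++ {
			mu.Lock()
			value := counter
			runtime.Gosched() // yielding is harmless while the lock is held
			value++
			counter = value
			mu.Unlock()
		}
	}
	wg.Add(2)
	go incCounter()
	go incCounter()
	wg.Wait()
	return counter
}

func main() {
	fmt.Println("Final Counter:", runCounter())
}
```

Even though runtime.Gosched() still gives the other goroutine a chance to run, it cannot enter the critical section until the lock is released, so no increment is overwritten.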

Concurrent access to maps with 'range' in Go

The "Go maps in action" entry in the Go blog states:
Maps are not safe for concurrent use: it's not defined what happens when you read and write to them simultaneously. If you need to read from and write to a map from concurrently executing goroutines, the accesses must be mediated by some kind of synchronization mechanism. One common way to protect maps is with sync.RWMutex.
However, one common way to access maps is to iterate over them with the range keyword. It is not clear if for the purposes of concurrent access, execution inside a range loop is a "read", or just the "turnover" phase of that loop. For example, the following code may or may not run afoul of the "no concurrent r/w on maps" rule, depending on the specific semantics / implementation of the range operation:
var testMap map[int]int
testMapLock := make(chan bool, 1)
testMapLock <- true
testMapSequence := 0
...
func WriteTestMap(k, v int) {
<-testMapLock
testMap[k] = v
testMapSequence++
testMapLock<-true
}
func IterateMapKeys(iteratorChannel chan int) error {
<-testMapLock
defer func() {
testMapLock <- true
}()
mySeq := testMapSequence
for k, _ := range testMap {
testMapLock <- true
iteratorChannel <- k
<-testMapLock
if mySeq != testMapSequence {
close(iteratorChannel)
return errors.New("concurrent modification")
}
}
return nil
}
The idea here is that the range "iterator" is open when the second function is waiting for a consumer to take the next value, and the writer is not blocked at that time. However, it is never the case that two reads in a single iterator are on either side of a write - this is a "fail fast" iterator, to borrow a Java term.
Is there anything anywhere in the language specification or other documents that indicates if this is a legitimate thing to do, however? I could see it going either way, and the above quoted document is not clear on exactly what constitutes a "read". The documentation seems totally quiet on the concurrency aspects of the for/range statement.
(Please note this question is about the concurrency of for/range, but not a duplicate of: Golang concurrent map access with range - the use case is completely different and I am asking about the precise locking requirement wrt the 'range' keyword here!)

You are using a for statement with a range expression. Quoting from Spec: For statements:
The range expression is evaluated once before beginning the loop, with one exception: if the range expression is an array or a pointer to an array and at most one iteration variable is present, only the range expression's length is evaluated; if that length is constant, by definition the range expression itself will not be evaluated.
We're ranging over a map, so it's not an exception: the range expression is evaluated only once before beginning the loop. The range expression is simply a map variable testMap:
for k, _ := range testMap {}
The map value does not include the key-value pairs, it only points to a data structure that does. Why is this important? Because the map value is only evaluated once, and if later pairs are added to the map, the map value –evaluated once before the loop– will be a map that still points to a data structure that includes those new pairs. This is in contrast to ranging over a slice (which would be evaluated once too), which is also only a header pointing to a backing array holding the elements; but if elements are added to the slice during the iteration, even if that does not result in allocating and copying over to a new backing array, they will not be included in the iteration (because the slice header also contains the length - already evaluated). Appending elements to a slice may result in a new slice value, but adding pairs to a map will not result in a new map value.
Now on to iteration:
for k, v := range testMap {
t1 := time.Now()
someFunc()
t2 := time.Now()
}
Before we enter the block, that is, before the t1 := time.Now() line, the k and v variables already hold the values of the iteration; they have already been read out from the map (else they couldn't hold the values). Question: do you think the map is read by the for ... range statement between t1 and t2? Under what circumstances could that happen? We have here a single goroutine that is executing someFunc(). For the for statement to access the map, that would either require another goroutine, or it would require suspending someFunc(). Obviously neither of those happens. (The for ... range construct is not a multi-goroutine monster.) No matter how many iterations there are, while someFunc() is executed, the map is not accessed by the for statement.
So to answer one of your questions: the map is not accessed inside the for block when executing an iteration, but it is accessed when the k and v values are set (assigned) for the next iteration. This implies that the following iteration over the map is safe for concurrent access:
var (
testMap = make(map[int]int)
testMapLock = &sync.RWMutex{}
)
func IterateMapKeys(iteratorChannel chan int) error {
testMapLock.RLock()
defer testMapLock.RUnlock()
for k, v := range testMap {
testMapLock.RUnlock()
someFunc()
testMapLock.RLock()
if someCond {
return someErr
}
}
return nil
}
Note that unlocking in IterateMapKeys() should (must) happen as a deferred statement, as in your original code you may return "early" with an error, in which case you didn't unlock, which means the map remained locked! (Here modeled by if someCond {...}).
Also note that this type of locking only protects the individual map accesses. It does not prevent a concurrent goroutine from modifying the map (e.g. adding a new pair) between iterations. The modification (if properly guarded with the write lock) will be safe, and the loop may continue, but there is no guarantee that the for loop will iterate over the new pair:
If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next.
The write-lock-guarded modification may look like this:
func WriteTestMap(k, v int) {
testMapLock.Lock()
defer testMapLock.Unlock()
testMap[k] = v
}
Now if you release the read lock in the block of the for, a concurrent goroutine is free to grab the write lock and make modifications to the map. In your code:
testMapLock <- true
iteratorChannel <- k
<-testMapLock
While k is being sent on the iteratorChannel, a concurrent goroutine may modify the map. This is not just an "unlucky" scenario. Sending a value on a channel is often a blocking operation: if the channel is unbuffered or its buffer is full, another goroutine must be ready to receive before the send can proceed. A channel send is therefore a good scheduling point for the runtime to run other goroutines, even on the same OS thread, not to mention the case of multiple OS threads, one of which may already be waiting for the write lock in order to modify the map.
To sum up the last part: releasing the read lock inside the for block is like yelling to others: "Come, modify the map now if you dare!" Consequently, in your code it is very likely that you will encounter mySeq != testMapSequence. This runnable example (a variation of yours) demonstrates it:
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

var (
	testMap         = make(map[int]int)
	testMapLock     = &sync.RWMutex{}
	testMapSequence int // incremented on every write, guarded by testMapLock
)

func main() {
	// Writer: keeps modifying the map.
	go func() {
		for {
			k := rand.Intn(10000)
			WriteTestMap(k, 1)
		}
	}()
	// Consumer: drains the keys sent by IterateMapKeys.
	ic := make(chan int)
	go func() {
		for range ic {
		}
	}()
	for {
		if err := IterateMapKeys(ic); err != nil {
			fmt.Println(err)
		}
	}
}

func WriteTestMap(k, v int) {
	testMapLock.Lock()
	defer testMapLock.Unlock()
	testMap[k] = v
	testMapSequence++
}

func IterateMapKeys(iteratorChannel chan int) error {
	testMapLock.RLock()
	defer testMapLock.RUnlock()
	mySeq := testMapSequence
	for k := range testMap {
		testMapLock.RUnlock() // the writer may grab the write lock from here on
		iteratorChannel <- k
		testMapLock.RLock()
		if mySeq != testMapSequence {
			//close(iteratorChannel)
			return fmt.Errorf("concurrent modification %d", testMapSequence)
		}
	}
	return nil
}
Example output:
concurrent modification 24
concurrent modification 41
concurrent modification 463
concurrent modification 477
concurrent modification 482
concurrent modification 496
concurrent modification 508
concurrent modification 521
concurrent modification 525
concurrent modification 535
concurrent modification 541
concurrent modification 555
concurrent modification 561
concurrent modification 565
concurrent modification 570
concurrent modification 577
concurrent modification 591
concurrent modification 593
We're encountering concurrent modification quite often!
Do you want to avoid this kind of concurrent modification? The solution is quite simple: don't release the read lock inside the for. Also run your app with the -race option to detect race conditions: go run -race testmap.go
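One way to keep the read lock for the whole loop without sending on a channel while holding it is to copy the keys first. This is only a sketch of that idea, not something from your code: SnapshotKeys is a hypothetical helper, and the snapshot can of course be stale by the time the keys are sent.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

var (
	testMap     = make(map[int]int)
	testMapLock sync.RWMutex
)

// WriteTestMap stores a pair under the write lock.
func WriteTestMap(k, v int) {
	testMapLock.Lock()
	defer testMapLock.Unlock()
	testMap[k] = v
}

// SnapshotKeys is a hypothetical helper: it holds the read lock for the
// entire range loop and returns a copy of the keys.
func SnapshotKeys() []int {
	testMapLock.RLock()
	defer testMapLock.RUnlock()
	keys := make([]int, 0, len(testMap))
	for k := range testMap {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	for i := 0; i < 5; i++ {
		WriteTestMap(i, i*i)
	}
	ic := make(chan int)
	go func() {
		defer close(ic)
		for _, k := range SnapshotKeys() {
			ic <- k // no lock is held while sending
		}
	}()
	var got []int
	for k := range ic {
		got = append(got, k)
	}
	sort.Ints(got) // map iteration order is not specified
	fmt.Println(got) // [0 1 2 3 4]
}
```

The lock is never released in the middle of the range loop, so no writer can slip in between two iterations; writers only wait for the short time it takes to copy the keys.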
Final thoughts
The language spec clearly allows you to modify the map in the same goroutine while ranging over it; this is what the previous quote relates to ("If map entries that have not yet been reached are removed during iteration.... If map entries are created during iteration..."). Modifying the map in the same goroutine is allowed and is safe, but how the iterator logic handles it is not fully specified.
If the map is modified in another goroutine and you use proper synchronization, The Go Memory Model guarantees that the goroutine running the for ... range will observe all modifications, and the iterator logic will see them just as if its own goroutine had made them, which is allowed as stated before.
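As a small illustration of modifying a map from the same goroutine that ranges over it, the sketch below deletes entries during iteration. DropEven is a hypothetical name chosen for this example. Deletion gives a deterministic result; entries inserted during the loop may or may not be visited, so the sketch does not rely on that.

```go
package main

import "fmt"

// DropEven is a hypothetical helper: it deletes every even key while
// ranging over m, which the spec permits within a single goroutine.
func DropEven(m map[int]int) {
	for k := range m {
		if k%2 == 0 {
			delete(m, k)
		}
	}
}

func main() {
	m := make(map[int]int)
	for i := 0; i < 10; i++ {
		m[i] = i
	}
	DropEven(m)
	fmt.Println(len(m)) // 5
}
```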

The unit of concurrent access for a for range loop over a map is the map itself (see "Go maps in action").
A map is a dynamic data structure that changes with inserts, updates and deletes (see "Inside the Map Implementation"). For example:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
(For statements, The Go Programming Language Specification)
Reading a map with a for range loop with interleaved inserts, updates and deletes is unlikely to be useful.
Lock the map:
package main

import (
	"sync"
)

var racer map[int]int
var race sync.RWMutex

func Reader() {
	race.RLock() // Lock map
	for k, v := range racer {
		_, _ = k, v
	}
	race.RUnlock()
}

func Write() {
	for i := 0; i < 1e6; i++ {
		race.Lock()
		racer[i/2] = i
		race.Unlock()
	}
}

func main() {
	racer = make(map[int]int)
	Write()
	go Write()
	Reader()
}
Don't acquire the lock only after the range statement has already read the map. That fails with fatal error: concurrent map iteration and map write:
package main

import (
	"sync"
)

var racer map[int]int
var race sync.RWMutex

func Reader() {
	for k, v := range racer {
		race.RLock() // Lock after read
		_, _ = k, v
		race.RUnlock()
	}
}

func Write() {
	for i := 0; i < 1e6; i++ {
		race.Lock()
		racer[i/2] = i
		race.Unlock()
	}
}

func main() {
	racer = make(map[int]int)
	Write()
	go Write()
	Reader()
}
Use the Go Data Race Detector. Read Introducing the Go Race Detector.
