Data race in Go: Why does it happen below 10-11ms?

Here is the code I ran:
package main

import (
	"fmt"
	"time"
)

const delay = 9 * time.Millisecond

func main() {
	n := 0
	go func() {
		time.Sleep(delay)
		n++
	}()
	fmt.Println(n)
}
Here is the command I used:
go run -race data_race_demo.go
Here is the behavior I noticed:
With delay set to 9ms or lower, data race is always detected (program throws Found 1 data race(s))
With delay set to 12ms or higher, data race is never detected (program simply prints 0)
With delay set to 10 to 11ms, data race occurs intermittently (that is, sometimes prints 0 and sometimes throws Found 1 data race(s))
Why does this happen at around 10-11ms?
I'm using Go 1.16.3 on darwin/amd64, if that matters.

You have 2 goroutines: the main goroutine and the one you launch. They access the variable n (and one access is a write) without synchronization: that's a data race.
Whether this race is detected depends on whether the racy access actually occurs. When the main() function ends, your app ends as well; it does not wait for other non-main goroutines to finish.
If you increase the sleep delay, main() ends before the sleep does and never waits for the racy n++ write to occur, so nothing is detected. If the sleep is short, shorter than the time it takes to execute fmt.Println() and exit, the racy write occurs and is detected.
There's nothing special about the 10 ms. That's just the approximate time it takes to execute fmt.Println() and terminate your app in your environment. If you do some other "lengthy" task before the Println() statement, such as this:
for i := 0; i < 5_000_000_000; i++ {
}
fmt.Println(n)
The race will be detected even with a 50 ms sleep (because that loop takes some time to execute, allowing the racy write to occur before n is read for the fmt.Println() call and the app terminates). (A simple time.Sleep() would also do; I just didn't want anyone to draw the wrong conclusion that the two sleeps "interact" with each other somehow.)
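For completeness, here is a minimal sketch (not part of the original question) of how the race disappears once the write is synchronized with the read, using a done channel; a sync.WaitGroup would work just as well:
package main

import (
	"fmt"
	"time"
)

const delay = 9 * time.Millisecond

func main() {
	n := 0
	done := make(chan struct{})
	go func() {
		time.Sleep(delay)
		n++
		close(done) // the close happens-before the receive below completes
	}()
	<-done         // wait for the goroutine, which also synchronizes the write to n
	fmt.Println(n) // always prints 1, and -race stays silent regardless of delay
}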

Related

Computing mod inverse

I want to compute the inverse element of a prime in modular arithmetic.
In order to speed things up I start a few goroutines which try to find the element in a certain range. When the first one finds the element, it sends it to the main goroutine and at this point I want to terminate the program. So I call close in the main goroutine, but I don't know if the goroutines will finish their execution (I guess not). So a few questions arise:
1) Is this a bad style, should I have something like a WaitGroup?
2) Is there a more idiomatic way to do this computation?
package main

import "fmt"

const (
	Procs = 8
	P     = 1000099
	Base  = 1<<31 - 1
)

func compute(start, end uint64, finished chan struct{}, output chan uint64) {
	for i := start; i < end; i++ {
		select {
		case <-finished:
			return
		default:
			break
		}
		if i*P%Base == 1 {
			output <- i
		}
	}
}

func main() {
	finished := make(chan struct{})
	output := make(chan uint64)
	for i := uint64(0); i < Procs; i++ {
		start := i * (Base / Procs)
		end := (i + 1) * (Base / Procs)
		go compute(start, end, finished, output)
	}
	fmt.Println(<-output)
	close(finished)
}
Is there a more idiomatic way to do this computation?
You don't actually need a loop to compute this.
If you use the GCD function (part of the standard library), it returns numbers x and y such that:
x*P + y*Base = 1
which means that x is the answer you want (because x*P = 1 modulo Base):
package main

import (
	"fmt"
	"math/big"
)

const (
	P    = 1000099
	Base = 1<<31 - 1
)

func main() {
	bigP := big.NewInt(P)
	bigBase := big.NewInt(Base)
	// Compute the inverse of bigP modulo bigBase.
	bigGcd := big.NewInt(0)
	bigX := big.NewInt(0)
	bigGcd.GCD(bigX, nil, bigP, bigBase)
	// x*bigP + y*bigBase = 1
	// => x*bigP = 1 modulo bigBase
	fmt.Println(bigX)
}
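Two footnotes on this (both assuming a reasonably recent Go release): the x returned by GCD can be negative, so if you need the canonical representative in [0, Base) you would reduce it with bigX.Mod(bigX, bigBase); and math/big also provides ModInverse, which computes the inverse directly and already returns it in that range. A minimal sketch:
package main

import (
	"fmt"
	"math/big"
)

const (
	P    = 1000099
	Base = 1<<31 - 1
)

func main() {
	// ModInverse sets the result to x such that x*P ≡ 1 (mod Base).
	inv := new(big.Int).ModInverse(big.NewInt(P), big.NewInt(Base))
	fmt.Println(inv)
}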
Is this a bad style, should I have something like a WaitGroup?
A wait group solves a different problem.
In general, to be a responsible go citizen here and ensure your code runs and tidies up behind itself, you may need to do a combination of:
Signal to the spawned goroutines to stop their calculations when the result of the computation has been found elsewhere.
Ensure a synchronous process waits for the goroutines to stop before returning. This is not mandatory if they properly respond to the signal in #1, but if you don't wait, there will be no guarantee they have terminated before the parent goroutine continues.
In your example program, which performs this task and then quits, there is strictly no need to do either. Your program's main method terminates upon a satisfactory answer being found, at which point the program ends, any goroutines are summarily terminated, and the operating system tidies up any consumed resources. Waiting for goroutines to stop is unnecessary.
However, if you wrapped this code up into a library or it became part of a long running "inverse prime calculation" service, it would be desirable to tidy up the goroutines you spawned to avoid wasting cycles unnecessarily. Additionally, in general, you may have other scenarios in which goroutines store state, hold handles to external resources, or hold handles to internal objects which you risk leaking if not properly tidied away – it is desirable to properly close these.
Communicating the requirement to stop working
There are several approaches to communicate this. I don't claim this is an exhaustive list! (Please do suggest other general-purpose methods in the comments or by proposing edits to the post.)
Using a special channel
Signal the child goroutines by closing a special "shutdown" channel reserved for the purpose. This exploits the channel axiom:
A receive from a closed channel returns the zero value immediately
On receiving from the shutdown channel, the goroutine should immediately arrange to tidy any local state and return from the function. Your earlier question had example code which implemented this; a version of the pattern is:
func myGoRoutine(shutdownChan <-chan struct{}) {
	select {
	case <-shutdownChan:
		// tidy-up behaviour goes here
		return
		// You may choose to listen on other channels here to implement
		// the primary behaviour of the goroutine.
	}
}

func main() {
	shutdownChan := make(chan struct{})
	go myGoRoutine(shutdownChan)

	// some time later
	close(shutdownChan)
}
In this instance, the shutdown logic is wasted because the main() method will return immediately after the call to close. That return races with the shutdown of the goroutine, so we should assume the goroutine will not get to execute its tidy-up behaviour. Point 2 addresses ways to fix this.
Using a context
The context package provides the option to create a context which can be cancelled. On cancellation, a channel exposed by the context's Done() method will be closed, which signals time to return from the goroutine.
This approach is approximately the same as the previous method, with the exception of neater encapsulation and the availability of a context to pass to downstream calls in your goroutine to cancel nested calls where desired. Example:
func myGoRoutine(ctx context.Context) {
	select {
	case <-ctx.Done():
		// tidy-up behaviour goes here
		return
		// Put real behaviour for the goroutine here.
	}
}

func main() {
	// Get a context (or use an existing one if you are provided with one
	// outside a `main` method):
	ctx := context.Background()

	// Create a derived context with a cancellation method.
	ctx, cancel := context.WithCancel(ctx)

	go myGoRoutine(ctx)

	// Later, when ready to quit
	cancel()
}
This has the same bug as the other case in that the main method will not wait for the child goroutines to quit before returning.
Waiting (or "join"ing) for child goroutines to stop
The code which closes the shutdown channel or closes the context in the above examples will not wait for child goroutines to stop working before continuing. This may be acceptable in some instances, while in others you may require the guarantee that goroutines have stopped before continuing.
sync.WaitGroup can be used to implement this requirement. The documentation is comprehensive. A wait group is a counter which should be incremented using its Add method on starting a goroutine and decremented using its Done method when a goroutine completes. Code can wait for the counter to return to zero by calling its Wait method, which blocks until the condition is true. All calls to Add must occur before a call to Wait.
Example code:
func main() {
	var wg sync.WaitGroup

	// Increment the WaitGroup with the number of goroutines we're
	// spawning.
	wg.Add(1)

	// It is common to wrap a goroutine in a function which performs
	// the decrement on the WaitGroup once the called function returns
	// to avoid passing references of this control logic to the
	// downstream consumer.
	go func() {
		// TODO: implement a method to communicate shutdown.
		callMyFunction()
		wg.Done()
	}()

	// Indicate shutdown, e.g. by closing a channel or cancelling a
	// context.

	// Wait for goroutines to stop.
	wg.Wait()
}
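Putting the two halves together for the shape of the program in the question, here is a sketch (not a drop-in replacement; the compute signature differs slightly from the original, and the last range is extended to Base so the whole interval is covered) in which a context tells the workers to stop and a WaitGroup makes main wait until they actually have:
package main

import (
	"context"
	"fmt"
	"sync"
)

const (
	Procs = 8
	P     = 1000099
	Base  = 1<<31 - 1
)

func compute(ctx context.Context, start, end uint64, output chan<- uint64) {
	for i := start; i < end; i++ {
		select {
		case <-ctx.Done():
			return // another worker already found the answer
		default:
		}
		if i*P%Base == 1 {
			select {
			case output <- i:
			default: // a result was already reported
			}
			return
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	output := make(chan uint64, 1) // buffered so the first sender never blocks
	var wg sync.WaitGroup

	for i := uint64(0); i < Procs; i++ {
		start := i * (Base / Procs)
		end := (i + 1) * (Base / Procs)
		if i == Procs-1 {
			end = Base // cover the remainder left over by the integer division
		}
		wg.Add(1)
		go func() {
			defer wg.Done()
			compute(ctx, start, end, output)
		}()
	}

	fmt.Println(<-output)
	cancel()  // tell the remaining workers to stop
	wg.Wait() // and wait until they have actually returned
}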
Is there a more idiomatic way to do this computation?
This algorithm is certainly parallelizable through use of goroutines in the manner you have defined. As the work is CPU-bound, limiting the number of goroutines to the number of available CPUs makes sense (in the absence of other work on the machine) to benefit from the available compute resources.
See peterSO's answer for a bug fix.

The variable in Goroutine not changed as expected

The code is as simple as below:
package main

import (
	"fmt"
	// "sync"
	"time"
)

var count = uint64(0)

//var l sync.Mutex

func add() {
	for {
		// l.Lock()
		// fmt.Println("Start ++")
		count++
		// l.Unlock()
	}
}

func main() {
	go add()
	time.Sleep(1 * time.Second)
	fmt.Println("Count =", count)
}
Cases:
Running the code unchanged, you get "Count = 0". Not expected?
Uncommenting only line 16 (fmt.Println("Start ++")), you get lots of "Start ++" output and a non-zero count such as "Count = 11111". Expected?
Uncommenting only line 11 (var l sync.Mutex), line 15 (l.Lock()) and line 18 (l.Unlock()), keeping line 16 commented, you get output like "Count = 111111111". Expected.
So... is something wrong with how I use the shared variable? My questions:
Why does case 1 print Count = 0?
If case 1 is expected, why does case 2 happen?
Env:
1. go version go1.8 linux/amd64
2. 3.10.0-123.el7.x86_64
3. CentOS Linux release 7.0.1406 (Core)
You have a data race on count. The results are undefined.
package main

import (
	"fmt"
	// "sync"
	"time"
)

var count = uint64(0)

//var l sync.Mutex

func add() {
	for {
		// l.Lock()
		// fmt.Println("Start ++")
		count++
		// l.Unlock()
	}
}

func main() {
	go add()
	time.Sleep(1 * time.Second)
	fmt.Println("Count =", count)
}
Output:
$ go run -race racer.go
==================
WARNING: DATA RACE
Read at 0x0000005995b8 by main goroutine:
runtime.convT2E64()
/home/peter/go/src/runtime/iface.go:255 +0x0
main.main()
/home/peter/gopath/src/so/racer.go:25 +0xb9
Previous write at 0x0000005995b8 by goroutine 6:
main.add()
/home/peter/gopath/src/so/racer.go:17 +0x5c
Goroutine 6 (running) created at:
main.main()
/home/peter/gopath/src/so/racer.go:23 +0x46
==================
Count = 42104672
Found 1 data race(s)
$
References:
Benign data races: what could possibly go wrong?
Without any synchronisation you have no guarantees at all.
There could be multiple reasons why you see "Count = 0" in your first case:
You have a multi-processor or multi-core system, and one unit (CPU or core) is happily churning away at the for loop while the other sleeps for one second and then prints the line you see. It would be completely legal for the compiler to generate machine code that loads the value into a register and only ever increments that register inside the for loop; the memory location only has to be updated once the function is done with the variable, which, for an infinite loop, is never. By omitting any synchronisation you, the programmer, have told the compiler that there is no contention on that variable.
In your mutex version the synchronisation primitives tell the compiler that some other thread might take the mutex, so it needs to write the value back from the register to the memory location before unlocking. At least you can think of it that way. What really happens is that the unlock and a later lock operation introduce a happens-before relation between the two goroutines, which guarantees that all writes made before the unlock in one goroutine are visible after the lock in the other goroutine, as described in the Go memory model's section on locks, however this is implemented.
The Go runtime scheduler might not run the for loop at all until the sleep in the main goroutine is done. (That isn't likely, but if I recall correctly there is no guarantee that it doesn't happen.) Sadly there is not much official documentation about how the scheduler in Go works, but it can only schedule a goroutine at certain points; it is not really preemptive. The consequences of this are severe. For example, in some versions of Go you could make your program run forever by firing up as many goroutines as you had cores, each doing an endless for loop that only increments a variable. No core would be left for the main goroutine (which could end the program), and the scheduler cannot preempt a goroutine stuck in an endless for loop doing only simple work such as incrementing a variable. I don't know whether that has changed by now.
As others pointed out, this is a data race; search for the term and read up on it.
The difference between your versions, where only line 16 is commented or uncommented, is likely just down to run time, as printing to a terminal can be pretty slow.
For a correct program, you additionally need to lock the mutex after the sleep in your main function and before the fmt.Println, and unlock it afterwards. But there can't be a deterministic expectation about the output, as the result will vary with machine/OS/...
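For reference, a minimal sketch of the corrected program along those lines, with the mutex guarding both the increment and the final read (the printed value will still vary from run to run and machine to machine):
package main

import (
	"fmt"
	"sync"
	"time"
)

var (
	count uint64
	l     sync.Mutex
)

func add() {
	for {
		l.Lock()
		count++
		l.Unlock()
	}
}

func main() {
	go add()
	time.Sleep(1 * time.Second)
	l.Lock() // also guard the read, otherwise it still races with add()
	c := count
	l.Unlock()
	fmt.Println("Count =", c)
}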

Golang Timers with 0 length

I've written a code snippet that creates a timer with a zero-length duration, and it does not immediately expire, even though that is what I expected it to do. A very short sleep call does make it expire, but I'm confused as to why.
The reason I care is that the code using this idea has a snippet that returns 0 on a low-probability error, with the idea that the timer should then be set to expire immediately and retry a function. I don't believe that the nanosecond sleep needed here will affect my implementation, but it bothers me.
Did I make a mistake, or is this expected behaviour?
Thanks!
package main

import (
	"fmt"
	"time"
)

func main() {
	testTimer := time.NewTimer(time.Duration(0) * time.Millisecond)
	fmt.Println(Expired(testTimer))
	time.Sleep(time.Nanosecond)
	fmt.Println(Expired(testTimer))
}

func Expired(T *time.Timer) bool {
	select {
	case <-T.C:
		return true
	default:
		return false
	}
}
Playground link: https://play.golang.org/p/xLLHoR8aKq
Prints
false
true
time.NewTimer() does not guarantee maximum wait time. It only guarantees a minimum wait time. Quoting from its doc:
NewTimer creates a new Timer that will send the current time on its channel after at least duration d.
So when you pass a zero duration to time.NewTimer(), it's no surprise that the returned time.Timer has not "expired" immediately.
The returned timer could be "expired" immediately if the implementation checked whether the passed duration is zero and sent a value on the timer's channel before returning it, but it does not. Instead it starts an internal timer normally, as it does for any duration, which will take care of sending a value on the channel, but only some time in the future.
Note that with multiple CPU cores and with runtime.GOMAXPROCS() being greater than 1 there is a slight chance that another goroutine (internal to the time package) sends a value on the timer's channel before NewTimer() returns, but this is a very small chance... Also since this is implementation detail, a future version might add this "optimization" to check for 0 passed duration, and act as you expected it, but as with all implementation details, don't count on it. Count on what's documented, and expect no more.
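If the goal is simply to treat a zero-duration timer as fired, a blocking receive on the channel works, since it waits for however long the runtime takes to deliver the value. A minimal sketch:
package main

import (
	"fmt"
	"time"
)

func main() {
	t := time.NewTimer(0)
	// Block until the runtime sends the time on the channel; for a zero
	// duration this returns almost immediately, unlike the non-blocking
	// select in the question, which simply checks too early.
	<-t.C
	fmt.Println("timer fired")
}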
Go's timer functions guarantee to sleep at least the specified time. See the docs for Sleep and NewTimer respectively:
Sleep pauses the current goroutine for at least the duration d. A negative or zero duration causes Sleep to return immediately.
NewTimer creates a new Timer that will send the current time on its channel after at least duration d.
(emphasis added)
In your situation, you should probably just not use a timer at all when you don't want to sleep.
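For instance, a retry loop can skip the wait entirely when the computed delay is zero. The following is only a sketch; retry, doWork and backoff are hypothetical placeholders, not names from the question:
package main

import (
	"errors"
	"fmt"
	"time"
)

// retry runs doWork up to maxAttempts times, sleeping according to backoff
// between attempts. doWork and backoff stand in for the real operation and
// its retry-delay policy.
func retry(maxAttempts int, backoff func(int) time.Duration, doWork func() error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = doWork(); err == nil {
			return nil
		}
		if d := backoff(attempt); d > 0 {
			time.Sleep(d) // only block when there is actually a delay
		}
		// A zero delay means: retry immediately, no timer involved.
	}
	return err
}

func main() {
	calls := 0
	err := retry(3,
		func(attempt int) time.Duration { return time.Duration(attempt) * 10 * time.Millisecond },
		func() error {
			calls++
			if calls < 3 {
				return errors.New("transient failure")
			}
			return nil
		},
	)
	fmt.Println(calls, err) // 3 <nil>
}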
This is due to the internal time it takes to set up the timer object. If you look at the playground link below, the timer does expire at the proper time, but the internal goroutine that sets it up and starts it takes longer than your Expired function takes to check it.
When the Timer expires, the current time will be sent on C (the channel)
So you'll notice that after it expires, it still sends the original time, because it has expired even before the nanosecond Sleep finished.
https://play.golang.org/p/Ghwq9kJq3J
package main

import (
	"fmt"
	"time"
)

func main() {
	testTimer := time.NewTimer(0 * time.Millisecond)
	Expired(testTimer)
	time.Sleep(time.Nanosecond)
	Expired(testTimer)
	n := time.Now()
	fmt.Printf("after waiting: %d\n", n.UnixNano())
}

func Expired(T *time.Timer) bool {
	select {
	case t := <-T.C:
		fmt.Printf("expired %d\n", t.UnixNano())
		return true
	default:
		n := time.Now()
		fmt.Printf("not expired: %d\n", n.UnixNano())
		return false
	}
}

Main thread never yields to goroutine

edit * -- uncomment the two runtime lines and change Tick() to Sleep() and it works as expected, printing one number every second. Leaving code as is so answer/comments make sense.
go version go1.4.2 darwin/amd64
When I run the following, I never see anything printed from go Counter().
package main

import (
	"fmt"
	"time"
	//"runtime"
)

var count int64 = 0

func main() {
	//runtime.GOMAXPROCS(2)
	fmt.Println("main")
	go Counter()
	fmt.Println("after Counter()")
	for {
		count++
	}
}

func Counter() {
	fmt.Println("In Counter()")
	for {
		fmt.Println(count)
		time.Tick(time.Second)
	}
}
> go run a.go
main
after Counter()
If I uncomment the runtime stuff, I will get some strange results like the following all printed at once (not a second apart):
> go run a.go
main
after Counter()
In Counter()
10062
36380
37351
38036
38643
39285
39859
40395
40904
43114
What I expect is that go Counter() will print whatever count is every second while it is incremented continuously by the main thread. I'm not so much looking for different code to get the same thing done. My question is more about what is going on and where I am wrong in my expectations. The second set of results in particular doesn't make any sense, being printed all at once. They should never be printed closer than a second apart as far as I can tell, right?
There is nothing in your tight for loop that lets the Go scheduler run.
The scheduler considers other goroutines whenever one blocks, or on some (which? all?) function calls.
Since you do neither, the easiest solution is to add a call to runtime.Gosched, e.g.:
for {
	count++
	if count%someNum == 0 {
		runtime.Gosched()
	}
}
Also note that writing and reading the same variable from different goroutines without locking or synchronization is a data race and there are no benign data races, your program could do anything when reading a value as it's being written. (And synchronization would also let the Go scheduler run.)
For a simple counter you can avoid locks by using atomic.AddInt64(&var, 1) and atomic.LoadInt64(&var).
Further note (as pointed out by @tomasz and completely missed by me): time.Tick on its own doesn't delay anything, since its return value, a channel, is discarded here; you may have wanted time.Sleep or for range time.Tick(…).
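A sketch of the program reworked along those lines, with the counter accessed through sync/atomic and time.Sleep doing the pausing. On a current Go release (1.14 and later have asynchronous pre-emption) this prints one number per second; on the Go 1.4 from the question the tight loop in main would still need the runtime.Gosched call shown above:
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

var count int64

func main() {
	go func() {
		for {
			// Print a consistent snapshot of the counter once per second.
			fmt.Println(atomic.LoadInt64(&count))
			time.Sleep(time.Second)
		}
	}()
	for {
		atomic.AddInt64(&count, 1)
	}
}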

Golang: goroutine infinite-loop

When the fmt.Print() line is removed from the code below, the code runs forever. Why?
package main
import "fmt"
import "time"
import "sync/atomic"
func main() {
	var ops uint64 = 0
	for i := 0; i < 50; i++ {
		go func() {
			for {
				atomic.AddUint64(&ops, 1)
				fmt.Print()
			}
		}()
	}
	time.Sleep(time.Second)
	opsFinal := atomic.LoadUint64(&ops)
	fmt.Println("ops:", opsFinal)
}
The Go By Example article includes:
// Allow other goroutines to proceed.
runtime.Gosched()
The fmt.Print() plays a similar role and gives main() a chance to proceed.
An export GOMAXPROCS=2 might help the program finish even in the case of an infinite loop, as explained in "golang: goroute with select doesn't stop unless I added a fmt.Print()".
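The same idea applied directly to the question's code, as a sketch: an explicit runtime.Gosched() in the loop instead of relying on the side effect of an empty fmt.Print(). (On current Go releases, 1.14 and later, the scheduler pre-empts tight loops asynchronously and neither workaround is needed.)
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

func main() {
	var ops uint64 = 0
	for i := 0; i < 50; i++ {
		go func() {
			for {
				atomic.AddUint64(&ops, 1)
				runtime.Gosched() // yield so other goroutines (and main) can run
			}
		}()
	}
	time.Sleep(time.Second)
	fmt.Println("ops:", atomic.LoadUint64(&ops))
}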
fmt.Print() explicitly passes control to some syscall stuff
Yes, go1.2+ has pre-emption in the scheduler
In prior releases, a goroutine that was looping forever could starve out other goroutines on the same thread, a serious problem when GOMAXPROCS provided only one user thread.
In Go 1.2, this is partially addressed: The scheduler is invoked occasionally upon entry to a function. This means that any loop that includes a (non-inlined) function call can be pre-empted, allowing other goroutines to run on the same thread.
Notice the emphasis (which I added): it is possible that in your example the atomic.AddUint64(&ops, 1) call in the for loop is inlined, so there is no pre-emption point there.
Update 2017: Go 1.10 removes the limit on the GOMAXPROCS setting.
