Golang: goroutine infinite-loop - go

When an fmt.Print() line is removed from the code below, code runs infinitely. Why?
package main
import "fmt"
import "time"
import "sync/atomic"
func main() {
var ops uint64 = 0
for i := 0; i < 50; i++ {
go func() {
for {
atomic.AddUint64(&ops, 1)
fmt.Print()
}
}()
}
time.Sleep(time.Second)
opsFinal := atomic.LoadUint64(&ops)
fmt.Println("ops:", opsFinal)
}

The Go By Example article includes:
// Allow other goroutines to proceed.
runtime.Gosched()
The fmt.Print() plays a similar role, and allows the main() to have a chance to proceed.
A export GOMAXPROCS=2 might help the program to finish even in the case of an infinite loop, as explained in "golang: goroute with select doesn't stop unless I added a fmt.Print()".
fmt.Print() explicitly passes control to some syscall stuff
Yes, go1.2+ has pre-emption in the scheduler
In prior releases, a goroutine that was looping forever could starve out other goroutines on the same thread, a serious problem when GOMAXPROCS provided only one user thread.
In Go 1.2, this is partially addressed: The scheduler is invoked occasionally upon entry to a function. This means that any loop that includes a (non-inlined) function call can be pre-empted, allowing other goroutines to run on the same thread.
Notice the emphasis (that I put): it is possible that in your example the for loop atomic.AddUint64(&ops, 1) is inlined. No pre-emption there.
Update 2017: Go 1.10 will get rid of GOMAXPROCS.

Related

Go Timer Deadlock on Stop

I am trying to reuse timers by stopping and resetting them. I am following the pattern provided by the documentation. Here is a simple example which can be run in go playground that demonstrates the issue I am experiencing.
Is there a correct way to stop and reset a timer that doesn't involve deadlock or race conditions? I am aware that using a select with default involves a race condition on channel message delivery timing and cannot be depended on.
package main
import (
"fmt"
"time"
"sync"
)
func main() {
fmt.Println("Hello, playground")
timer := time.NewTimer(1 * time.Second)
wg := &sync.WaitGroup{}
wg.Add(1)
go func(_wg *sync.WaitGroup) {
<- timer.C
fmt.Println("Timer done")
_wg.Done()
}(wg)
wg.Wait()
fmt.Println("Checking timer")
if !timer.Stop() {
<- timer.C
}
fmt.Println("Done")
}
According to the timer.Stop docs, there is a caveat for draining the channel:
assuming the program has not received from t.C already ...
This cannot be done concurrent to other receives from the Timer's
channel.
Since the channel has already been drained - and will never fire again, the second <-timer.C will block forever.
The question asks, in the first place, why the timer hangs. That's a good question, because even in the absence of bugs in the user program, there is... at least some weird ambiguity in how this thing, called time.Timer, works in Go. The spec says, specifically this:
Stop prevents the Timer from firing. It returns true if the call stops
the timer, false if the timer has already expired or been stopped.
Stop does not close the channel, to prevent a read from the channel
succeeding incorrectly.
To ensure the channel is empty after a call to Stop, check the return
value and drain the channel. For example, assuming the program has not
received from t.C already:
if !t.Stop() {
<-t.C
}
This cannot be done concurrent to other receives from the Timer's
channel or other calls to the Timer's Stop method.
There are very short and precise words, but it may be not that easy to understand them (at least for me). I tried to use the Timer repeatedly in a piece of code, and reset it each time before the next use. Each time I do so, I may want to Stop() it - just for sure. The spec above implies how you should do that, and provides an example - and it may not work! It depends, it depends where you try to apply the Stop idiom. In case you do it after you already in a select-case on this very timer, then it will hang the program.
Specifically, I do not have any concurrent receivers, only a single goroutine. So let's make a simple test program, and try to experiment with it (https://play.golang.org/p/d7BlNReE9Jz):
package main
import (
"fmt"
"time"
)
func main() {
i := 0
d2s := time.Second * 1
i++; fmt.Println(i)
t := time.NewTimer(d2s)
<-t.C
i++; fmt.Println(i)
t.Reset(d2s)
<-t.C
i++; fmt.Println(i)
// if !t.Stop() { <-t.C }
// if !t.Stop() { select { case <-t.C: default: } }
t.Reset(d2s)
<-t.C
i++; fmt.Println(i)
}
This code WORKS. It prints 1,2,3,4, delayed by 1 sec, and that's what it is expected to print. So far so good.
Now, try to un-comment the first commented line. Now the thing: according to spec, it is 100% right (is it?), and must work, but it does not, and hangs. Why? Because, according to spec, it must hang! I already read the channel, and the timer is stopped, so the if fires, and the channel drain op hangs.
Is this a bug? No. Is the spec wrong? No, it's correct. But, it's contrary to what a typical timer user would want. (Maybe a subject for proposal to Go?). All we need, is something like:
t.SafeStopDrain()
Which would do this right, and never hang. But, sadly, it is non-existent.
Here's the life-hack, the workaround, to make this work, is the second commented line. Un-comment it, and that will both work, and do what you wanted - make sure the timer is stopped, channel drained, and the whole thing is fresh anew for re-use.

Computing mod inverse

I want to compute the inverse element of a prime in modular arithmetic.
In order to speed things up I start a few goroutines which try to find the element in a certain range. When the first one finds the element, it sends it to the main goroutine and at this point I want to terminate the program. So I call close in the main goroutine, but I don't know if the goroutines will finish their execution (I guess not). So a few questions arise:
1) Is this a bad style, should I have something like a WaitGroup?
2) Is there a more idiomatic way to do this computation?
package main
import "fmt"
const (
Procs = 8
P = 1000099
Base = 1<<31 - 1
)
func compute(start, end uint64, finished chan struct{}, output chan uint64) {
for i := start; i < end; i++ {
select {
case <-finished:
return
default:
break
}
if i*P%Base == 1 {
output <- i
}
}
}
func main() {
finished := make(chan struct{})
output := make(chan uint64)
for i := uint64(0); i < Procs; i++ {
start := i * (Base / Procs)
end := (i + 1) * (Base / Procs)
go compute(start, end, finished, output)
}
fmt.Println(<-output)
close(finished)
}
Is there a more idiomatic way to do this computation?
You don't actually need a loop to compute this.
If you use the GCD function (part of the standard library), you get returned numbers x and y such that:
x*P+y*Base=1
this means that x is the answer you want (because x*P = 1 modulo Base):
package main
import (
"fmt"
"math/big"
)
const (
P = 1000099
Base = 1<<31 - 1
)
func main() {
bigP := big.NewInt(P)
bigBase := big.NewInt(Base)
// Compute inverse of bigP modulo bigBase
bigGcd := big.NewInt(0)
bigX := big.NewInt(0)
bigGcd.GCD(bigX,nil,bigP,bigBase)
// x*bigP+y*bigBase=1
// => x*bigP = 1 modulo bigBase
fmt.Println(bigX)
}
Is this a bad style, should I have something like a WaitGroup?
A wait group solves a different problem.
In general, to be a responsible go citizen here and ensure your code runs and tidies up behind itself, you may need to do a combination of:
Signal to the spawned goroutines to stop their calculations when the result of the computation has been found elsewhere.
Ensure a synchronous process waits for the goroutines to stop before returning. This is not mandatory if they properly respond to the signal in #1, but if you don't wait, there will be no guarantee they have terminated before the parent goroutine continues.
In your example program, which performs this task and then quits, there is strictly no need to do either. As this comment indicates, your program's main method terminates upon a satisfactory answer being found, at which point the program will end, any goroutines will be summarily terminated, and the operating system will tidy up any consumed resources. Waiting for goroutines to stop is unnecessary.
However, if you wrapped this code up into a library or it became part of a long running "inverse prime calculation" service, it would be desirable to tidy up the goroutines you spawned to avoid wasting cycles unnecessarily. Additionally, in general, you may have other scenarios in which goroutines store state, hold handles to external resources, or hold handles to internal objects which you risk leaking if not properly tidied away – it is desirable to properly close these.
Communicating the requirement to stop working
There are several approaches to communicate this. I don't claim this is an exhaustive list! (Please do suggest other general-purpose methods in the comments or by proposing edits to the post.)
Using a special channel
Signal the child goroutines by closing a special "shutdown" channel reserved for the purpose. This exploits the channel axiom:
A receive from a closed channel returns the zero value immediately
On receiving from the shutdown channel, the goroutine should immediately arrange to tidy any local state and return from the function. Your earlier question had example code which implemented this; a version of the pattern is:
func myGoRoutine(shutdownChan <-chan struct{}) {
select {
case <-shutdownChan:
// tidy up behaviour goes here
return
// You may choose to listen on other channels here to implement
// the primary behaviour of the goroutine.
}
}
func main() {
shutdownChan := make(chan struct{})
go myGoRoutine(shutdownChan)
// some time later
close(shutdownChan)
}
In this instance, the shutdown logic is wasted because the main() method will immediately return after the call to close. This will race with the shutdown of the goroutine, but we should assume it will not properly execute its tidy-up behaviour. Point 2 addresses ways to fix this.
Using a context
The context package provides the option to create a context which can be cancelled. On cancellation, a channel exposed by the context's Done() method will be closed, which signals time to return from the goroutine.
This approach is approximately the same as the previous method, with the exception of neater encapsulation and the availability of a context to pass to downstream calls in your goroutine to cancel nested calls where desired. Example:
func myGoRoutine(ctx context.Context) {
select {
case <-ctx.Done():
// tidy up behaviour goes here
return
// Put real behaviour for the goroutine here.
}
}
func main() {
// Get a context (or use an existing one if you are provided with one
// outside a `main` method:
ctx := context.Background()
// Create a derived context with a cancellation method
ctx, cancel := context.WithCancel(ctx)
go myGoRoutine(ctx)
// Later, when ready to quit
cancel()
}
This has the same bug as the other case in that the main method will not wait for the child goroutines to quit before returning.
Waiting (or "join"ing) for child goroutines to stop
The code which closes the shutdown channel or closes the context in the above examples will not wait for child goroutines to stop working before continuing. This may be acceptable in some instances, while in others you may require the guarantee that goroutines have stopped before continuing.
sync.WaitGroup can be used to implement this requirement. The documentation is comprehensive. A wait group is a counter which should be incremented using its Add method on starting a goroutine and decremented using its Done method when a goroutine completes. Code can wait for the counter to return to zero by calling its Wait method, which blocks until the condition is true. All calls to Add must occur before a call to Wait.
Example code:
func main() {
var wg sync.WaitGroup
// Increment the WaitGroup with the number of goroutines we're
// spawning.
wg.Add(1)
// It is common to wrap a goroutine in a function which performs
// the decrement on the WaitGroup once the called function returns
// to avoid passing references of this control logic to the
// downstream consumer.
go func() {
// TODO: implement a method to communicate shutdown.
callMyFunction()
wg.Done()
}()
// Indicate shutdown, e.g. by closing a channel or cancelling a
// context.
// Wait for goroutines to stop
wg.Wait()
}
Is there a more idiomatic way to do this computation?
This algorithm is certainly parallelizable through use of goroutines in the manner you have defined. As the work is CPU-bound, the limitation of goroutines to the number of available CPUs makes sense (in the absence of other work on the machine) to benefit from the available compute resource.
See peterSO's answer for a bug fix.

The variable in Goroutine not changed as expected

The codes are simple as below:
package main
import (
"fmt"
// "sync"
"time"
)
var count = uint64(0)
//var l sync.Mutex
func add() {
for {
// l.Lock()
// fmt.Println("Start ++")
count++
// l.Unlock()
}
}
func main() {
go add()
time.Sleep(1 * time.Second)
fmt.Println("Count =", count)
}
Cases:
Running the code without changing, u will get "Count = 0". Not expected??
Only uncomment line 16 "fmt.Println("Start ++")"; u will get output with lots of "Start ++" and some value with Count like "Count = 11111". Expected??
Only uncomment line 11 "var l sync.Mutex", line 15 "l.Lock()" and line 18 "l.Unlock()" and keep line 16 commented; u will get output like "Count = 111111111". Expected.
So... something wrong with my usage in shared variable...? My question:
Why case 1 had 0 with Count?
If case 1 is expected, why case 2 happened?
Env:
1. go version go1.8 linux/amd64
2. 3.10.0-123.el7.x86_64
3. CentOS Linux release 7.0.1406 (Core)
You have a data race on count. The results are undefined.
package main
import (
"fmt"
// "sync"
"time"
)
var count = uint64(0)
//var l sync.Mutex
func add() {
for {
// l.Lock()
// fmt.Println("Start ++")
count++
// l.Unlock()
}
}
func main() {
go add()
time.Sleep(1 * time.Second)
fmt.Println("Count =", count)
}
Output:
$ go run -race racer.go
==================
WARNING: DATA RACE
Read at 0x0000005995b8 by main goroutine:
runtime.convT2E64()
/home/peter/go/src/runtime/iface.go:255 +0x0
main.main()
/home/peter/gopath/src/so/racer.go:25 +0xb9
Previous write at 0x0000005995b8 by goroutine 6:
main.add()
/home/peter/gopath/src/so/racer.go:17 +0x5c
Goroutine 6 (running) created at:
main.main()
/home/peter/gopath/src/so/racer.go:23 +0x46
==================
Count = 42104672
Found 1 data race(s)
$
References:
Benign data races: what could possibly go wrong?
Without any synchronisation you have no guarantees at all.
There could be multiple reasons, why you see 'Count = 0' in your first case:
You have a multi processor or multi core system and one unit (cpu or core) is happily churning away at the for loop, while the other sleeps for one second and prints the line you are seeing afterwards. It would be completely legal for the compiler to generate machine code, which loads the value into some register and only ever increase that register in the for loop. The memory location can be updated, when the function is done with the variable. In case of an infinite for loop, that ist never. As you, the programmer told the compiler, that there is no contention about that variable, by omitting any synchronisation.
In your mutex version the synchronisation primitives tell the compiler,
that there might be some other thread taking the mutex, so it needs to write back the value from the register to the memory location before unlocking the mutex. At least one can think about it like that. What really happens that the unlock and a later lock operation introduce a happens before relation between the two go routines and this gives the guarantee, that we will see all writes to variables from before the unlock in one thread after the lock operation in the other thread, as described in go memory model locks howsoever this is implemented.
The Go runtime scheduler doesn't run the for loop at all, until the sleep in the main go routine is done. (Isn't likely, but, if I recall correctly, there is not guarantee that this isn't happening.) Sadly there is not much official documentation available about how the scheduler works in go, but it can only schedule a goroutine at certain points, it is not really preemptive. The consequences of this are severe. For example you could make your program run forever in some versions of go, by firing up as many go routines, as you had cores, doing endless for loops only incrementing a variable. There was no core left for the main go routine (which could end the program) and the scheduler can't preempt a go routine in an endless for loop doing only simple stuff, like incrementing a variable. I don't know, if that is changed now.
as others pointed out, that is a data race, google it and read up about it.
The difference between your versions there only line 16 is commented/uncommented is likely only because of run time, as printing to a terminal can be pretty slow.
For a correct program, you need to additionally lock the mutex after your sleep in you main program and before the fmt.Println and unlock it afterwards. But there can't be a deterministic expectation about the output, as the result will vary with machine/os/...

Main thread never yields to goroutine

edit * -- uncomment the two runtime lines and change Tick() to Sleep() and it works as expected, printing one number every second. Leaving code as is so answer/comments make sense.
go version go1.4.2 darwin/amd64
When I run the following, I never see anything printed from go Counter().
package main
import (
"fmt"
"time"
//"runtime"
)
var count int64 = 0
func main() {
//runtime.GOMAXPROCS(2)
fmt.Println("main")
go Counter()
fmt.Println("after Counter()")
for {
count++
}
}
func Counter() {
fmt.Println("In Counter()")
for {
fmt.Println(count)
time.Tick(time.Second)
}
}
> go run a.go
main
after Counter()
If I uncomment the runtime stuff, I will get some strange results like the following all printed at once (not a second apart):
> go run a.go
main
after Counter()
In Counter()
10062
36380
37351
38036
38643
39285
39859
40395
40904
43114
What I expect is that go Counter() will print whatever count is at every second while it is incremented continuously by the main thread. I'm not so much looking for different code to get the same thing done. My question is more about what is going on and where am I wrong in my expectations? The second results in particular don't make any sense, being printed all at once. They should never be printed closer than a second apart as far as I can tell, right?
There is nothing in your tight for loop that lets the Go scheduler run.
The schedule considers other goroutines whenever one blocks or on some (which? all?) function calls.
Since you do neither the easiest solution is to add a call to runtime.Gosched. E.g.:
for {
count++
if count % someNum == 0 {
runtime.Gosched()
}
}
Also note that writing and reading the same variable from different goroutines without locking or synchronization is a data race and there are no benign data races, your program could do anything when reading a value as it's being written. (And synchronization would also let the Go scheduler run.)
For a simple counter you can avoid locks by using atomic.AddInt64(&var, 1) and atomic.LoadInt64(&var).
Further note (as pointed out by #tomasz and completely missed by me), time.Tick doesn't delay anything, you may have wanted time.Sleep or for range time.Tick(…).

Issue with runtime.LockOSThread() and runtime.UnlockOSThread

I have a code like,
Routine 1 {
runtime.LockOSThread()
print something
send int to routine 2
runtime.UnlockOSThread
}
Routine 2 {
runtime.LockOSThread()
print something
send int to routine 1
runtime.UnlockOSThread
}
main {
go Routine1
go Routine2
}
I use run time lock-unlock because, I don't want that printing of
Routine 1 will mix with Routine 2. However, after execution of above
code, it outputs same as without lock-unlock (means printing outputs
mixed). Can anybody help me why this thing happening and how to force
this for happening.
NB: I give an example of print something, however there are lots of
printing and sending events.
If you want to serialize "print something", e.g. each "print something" should perform atomically, then just serialize it.
You can surround "print something" by a mutex. That'll work unless the code deadlock because of that - and surely it easily can in a non trivial program.
The easy way in Go to serialize something is to do it with a channel. Collect in a (go)routine everything which should be printed together. When collection of the print unit is done, send it through a channel to some printing "agent" as a "print job" unit. That agent will simply receive its "tasks" and atomically print each one. One gets that atomicity for free and as an important bonus the code can not deadlock easily no more in the simple case, where there are only non interdependent "print unit" generating goroutines.
I mean something like:
func printer(tasks chan string) {
for s := range tasks {
fmt.Printf(s)
}
}
func someAgentX(tasks chan string) {
var printUnit string
//...
tasks <- printUnit
//...
}
func main() {
//...
tasks := make(chan string, size)
go printer(tasks)
go someAgent1(tasks)
//...
go someAgentN(tasks)
//...
<- allDone
close(tasks)
}
What runtime.LockOSThread does is prevent any other goroutine from running on the same thread. It forces the runtime to create a new thread and run Routine2 there. They are still running concurrently but on different threads.
You need to use sync.Mutex or some channel magic instead.
You rarely need to use runtime.LockOSThread but it can be useful for forcing some higher priority goroutine to run on a thread of it's own.
package main
import (
"fmt"
"sync"
"time"
)
var m sync.Mutex
func printing(s string) {
m.Lock() // Other goroutines will stop here if until m is unlocked
fmt.Println(s)
m.Unlock() // Now another goroutine at "m.Lock()" can continue running
}
func main() {
for i := 0; i < 10; i++ {
go printing(fmt.Sprintf("Goroutine #%d", i))
}
<-time.After(3e9)
}
I think, this is because of runtime.LockOSThread(),runtime.UnlockOSThread does not work all time. It totaly depends on CPU, execution environment etc. It can't be forced by anyother way.

Resources