Go Timer Deadlock on Stop - go

I am trying to reuse timers by stopping and resetting them. I am following the pattern provided by the documentation. Here is a simple example which can be run in go playground that demonstrates the issue I am experiencing.
Is there a correct way to stop and reset a timer that doesn't involve deadlock or race conditions? I am aware that using a select with default involves a race condition on channel message delivery timing and cannot be depended on.
package main
import (
"fmt"
"time"
"sync"
)
func main() {
fmt.Println("Hello, playground")
timer := time.NewTimer(1 * time.Second)
wg := &sync.WaitGroup{}
wg.Add(1)
go func(_wg *sync.WaitGroup) {
<- timer.C
fmt.Println("Timer done")
_wg.Done()
}(wg)
wg.Wait()
fmt.Println("Checking timer")
if !timer.Stop() {
<- timer.C
}
fmt.Println("Done")
}

According to the timer.Stop docs, there is a caveat for draining the channel:
assuming the program has not received from t.C already ...
This cannot be done concurrent to other receives from the Timer's
channel.
Since the channel has already been drained - and will never fire again, the second <-timer.C will block forever.

The question asks, in the first place, why the timer hangs. That's a good question, because even in the absence of bugs in the user program, there is... at least some weird ambiguity in how this thing, called time.Timer, works in Go. The spec says, specifically this:
Stop prevents the Timer from firing. It returns true if the call stops
the timer, false if the timer has already expired or been stopped.
Stop does not close the channel, to prevent a read from the channel
succeeding incorrectly.
To ensure the channel is empty after a call to Stop, check the return
value and drain the channel. For example, assuming the program has not
received from t.C already:
if !t.Stop() {
<-t.C
}
This cannot be done concurrent to other receives from the Timer's
channel or other calls to the Timer's Stop method.
There are very short and precise words, but it may be not that easy to understand them (at least for me). I tried to use the Timer repeatedly in a piece of code, and reset it each time before the next use. Each time I do so, I may want to Stop() it - just for sure. The spec above implies how you should do that, and provides an example - and it may not work! It depends, it depends where you try to apply the Stop idiom. In case you do it after you already in a select-case on this very timer, then it will hang the program.
Specifically, I do not have any concurrent receivers, only a single goroutine. So let's make a simple test program, and try to experiment with it (https://play.golang.org/p/d7BlNReE9Jz):
package main
import (
"fmt"
"time"
)
func main() {
i := 0
d2s := time.Second * 1
i++; fmt.Println(i)
t := time.NewTimer(d2s)
<-t.C
i++; fmt.Println(i)
t.Reset(d2s)
<-t.C
i++; fmt.Println(i)
// if !t.Stop() { <-t.C }
// if !t.Stop() { select { case <-t.C: default: } }
t.Reset(d2s)
<-t.C
i++; fmt.Println(i)
}
This code WORKS. It prints 1,2,3,4, delayed by 1 sec, and that's what it is expected to print. So far so good.
Now, try to un-comment the first commented line. Now the thing: according to spec, it is 100% right (is it?), and must work, but it does not, and hangs. Why? Because, according to spec, it must hang! I already read the channel, and the timer is stopped, so the if fires, and the channel drain op hangs.
Is this a bug? No. Is the spec wrong? No, it's correct. But, it's contrary to what a typical timer user would want. (Maybe a subject for proposal to Go?). All we need, is something like:
t.SafeStopDrain()
Which would do this right, and never hang. But, sadly, it is non-existent.
Here's the life-hack, the workaround, to make this work, is the second commented line. Un-comment it, and that will both work, and do what you wanted - make sure the timer is stopped, channel drained, and the whole thing is fresh anew for re-use.

Related

Go reset a timer.NewTimer within select loop

I have a scenario in which I'm processing events on a channel, and one of those events is a heartbeat which needs to occur within a certain timeframe. Events which are not heartbeats will continue consuming the timer, however whenever the heartbeat is received I want to reset the timer. The obvious way to do this would be by using a time.NewTimer.
For example:
func main() {
to := time.NewTimer(3200 * time.Millisecond)
for {
select {
case event, ok := <-c:
if !ok {
return
} else if event.Msg == "heartbeat" {
to.Reset(3200 * time.Millisecond)
}
case remediate := <-to.C:
fmt.Println("do some stuff ...")
return
}
}
}
Note that a time.Ticker won't work here as the remediation should only be triggered if the heartbeat hasn't been received, not every time.
The above solution works in the handful of low volume tests I've tried it on, however I came across a Github issue indicating that resetting a Timer which has not fired is a no-no. Additionally the documentation states:
Reset should be invoked only on stopped or expired timers with drained channels. If a program has already received a value from t.C, the timer is known to have expired and the channel drained, so t.Reset can be used directly. If a program has not yet received a value from t.C, however, the timer must be stopped and—if Stop reports that the timer expired before being stopped—the channel explicitly drained:
if !t.Stop() {
<-t.C
}
t.Reset(d)
This gives me pause, as it seems to describe exactly what I'm attempting to do. I'm resetting the Timer whenever the heartbeat is received, prior to it having fired. I'm not experienced enough with Go yet to digest the whole post, but it certainly seems like I may be headed down a dangerous path.
One other solution I thought of is to simply replace the Timer with a new one whenever the heartbeat occurs, e.g:
else if event.Msg == "heartbeat" {
to = time.NewTimer(3200 * time.Millisecond)
}
At first I was worried that the rebinding to = time.NewTimer(3200 * time.Millisecond) wouldn't be visible within the select:
For all the cases in the statement, the channel operands of receive operations and the channel and right-hand-side expressions of send statements are evaluated exactly once, in source order, upon entering the "select" statement. The result is a set of channels to receive from or send to, and the corresponding values to send.
But in this particular case since we are inside a loop, I would expect that upon each iteration we re-enter select and therefore the new binding should be visible. Is that a fair assumption?
I realize there are similar questions out there, and I've tried to read the relevant posts/documentation, but I am new to Go just want to be sure I'm understanding things correctly here.
So my questions are:
Is my use of timer.Reset() unsafe, or are the cases mentioned in the Github issue highlighting other problems which are not applicable here? Is the explanation in the docs cryptic or do I just need more experience with Go?
If it is unsafe, is my second proposed solution acceptable (rebinding the timer on each iteration).
ADDENDUM
Upon further reading, most of the pitfalls outlined in the issues are describing scenarios in which the timer has already fired (placing a result on the channel), and subsequent to that firing some other process attempts to Reset it. For this narrow case, I understand the need to test with !t.Stop() since a false return of Stop would indicate the timer has already fired, and as such must be drained prior to calling Reset.
What I still do not understand, is why it is necessary to call t.Stop() prior to t.Reset(), when the Timer has yet to fire. None of the examples go into that as far as I can tell.
What I still do not understand, is why it is necessary to call t.Stop() prior to t.Reset(), when the Timer has yet to fire.
The "when the Timer has yet to fire" bit is critical here. The timer fires within a separate go routine (part of the runtime) and this can happen at any time. You have no way of knowing whether the timer has fired at the time you call to.Reset(3200 * time.Millisecond) (it may even fire while that function is running!).
Here is an example that demonstrates this and is somewhat similar to what you are attempting (based on this):
func main() {
eventC := make(chan struct{}, 1)
go keepaliveLoop(eventC )
// Reset the timer 1000 (approx) times; once every millisecond (approx)
// This should prevent the timer from firing (because that only happens after 2 ms)
for i := 0; i < 1000; i++ {
time.Sleep(time.Millisecond)
// Don't block if there is already a reset request
select {
case eventC <- struct{}{}:
default:
}
}
}
func keepaliveLoop(eventC chan struct{}) {
to := time.NewTimer(2 * time.Millisecond)
for {
select {
case <-eventC:
//if event.Msg == "heartbeat"...
time.Sleep(3 * time.Millisecond) // Simulate reset work (delay could be partly dur to whatever is triggering the
to.Reset(2 * time.Millisecond)
case <-to.C:
panic("this should never happen")
}
}
}
Try it in the playground.
This may appear contrived due to the time.Sleep(3 * time.Millisecond) but that is just included to consistently demonstrate the issue. Your code may work 99.9% of the time but there is always the possibility that both the event and timer channels will fire before the select is run (in which a random case will run) or while the code in the case event, ok := <-c: block is running (including while Reset() is in progress). The result of this happening would be unexpected calls of the remediate code (which may not be a big issue).
Fortunately solving the issue is relatively easy (following the advice in the documentation):
time.Sleep(3 * time.Millisecond) // Simulate reset work (delay could be partly dur to whatever is triggering the
if !to.Stop() {
<-to.C
}
to.Reset(2 * time.Millisecond)
Try this in the playground.
This works because to.Stop returns "true if the call stops the timer, false if the timer has already expired or been stopped". Note that things get a more complicated if the timer is used in multiple go-routines "This cannot be done concurrent to other receives from the Timer's channel or other calls to the Timer's Stop method" but this is not the case in your use-case.
Is my use of timer.Reset() unsafe, or are the cases mentioned in the Github issue highlighting other problems which are not applicable here?
Yes - it is unsafe. However the impact is fairly low. The event arriving and timer triggering would need to happen almost concurrently and, in that case, running the remediate code might not be a big issue. Note that the fix is fairly simple (as per the docs)
If it is unsafe, is my second proposed solution acceptable (rebinding the timer on each iteration).
Your second proposed solution also works (but note that the garbage collector cannot free the timer until after it has fired, or been stopped, which may cause issues if you are creating timers rapidly).
Note: Re the suggestion from #JotaSantos
Another thing that could be done is to add a select when draining <-to.C (on the Stop "if") with a default clause. That would prevent the pause.
See this comment for details of why this may not be a good approach (it's also unnecessary in your situation).
I've faced a similar issue. After reading a lot of information, I came up with a solution that goes along these lines:
package main
import (
"fmt"
"time"
)
func main() {
const timeout = 2 * time.Second
// Prepare a timer that is stopped and ready to be reset.
// Stop will never return false, because an hour is too long
// for timer to fire. Thus there's no need to drain timer.C.
timer := time.NewTimer(timeout)
timer.Stop()
// Make sure to stop the timer when we return.
defer timer.Stop()
// This variable is needed because we need to track if we can safely reset the timer
// in a loop. Calling timer.Stop() will return false on every iteration, but we can only
// drain the timer.C once, otherwise it will deadlock.
var timerSet bool
c := make(chan time.Time)
// Simulate events that come in every second
// and every 5th event delays so that timer can fire.
go func() {
var i int
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
for t := range ticker.C {
i++
if i%5 == 0 {
fmt.Println("Sleeping")
time.Sleep(3 * time.Second)
}
c <- t
if i == 20 {
break
}
}
close(c)
}()
for {
select {
case t, ok := <-c:
if !ok {
fmt.Println("Closed channel")
return
}
fmt.Println("Got event", t, timerSet)
// We got an event, and timer was already set.
// We need to stop the timer and drain the channel if needed,
// so that we can safely reset it later.
if timerSet {
if !timer.Stop() {
<-timer.C
}
timerSet = false
}
// If timer was not set, or it was stopped before, it's safe to reset it.
if !timerSet {
timerSet = true
timer.Reset(timeout)
}
case remediate := <-timer.C:
fmt.Println("Timeout", remediate)
// It's important to store that timer is not set anymore.
timerSet = false
}
}
}
Link to playground: https://play.golang.org/p/0QlujZngEGg

Timer example using timer.Reset() not working as described

I've been working with examples trying to get my first "go routine" running and while I got it running, it won't work as prescribed by the go documentation with the timer.Reset() function.
In my case I believe that the way I am doing it is just fine because I don't actually care what's in the chan buffer, if anything. All as this is meant to do is trigger case <-tmr.C: if anything happened on case _, ok := <-watcher.Events: and then all goes quiet for at least one second. The reason for this is that case _, ok := <-watcher.Events: can get from one to dozens of events microseconds apart and I only care once they are all done and things have settled down again.
However I'm concerned that doing it the way that the documentation says you "must do" doesn't work. If I knew go better I would say the documentation is flawed because it assumes there is something in the buffer when there may not be but I don't know go well enough to have confidence in making that determination so I'm hoping some experts out there can enlighten me.
Below is the code. I haven't put this up on playground because I would have to do some cleaning up (remove calls to other parts of the program) and I'm not sure how I would make it react to filesystem changes for showing it working.
I've clearly marked in the code which alternative works and which doesn't.
func (pm *PluginManager) LoadAndWatchPlugins() error {
// DOING OTHER STUFF HERE
fmt.Println(`m1`)
done := make(chan interface{})
terminated := make(chan interface{})
go pm.watchDir(done, terminated, nil)
fmt.Println(`m2.pre-10`)
time.Sleep(10 * time.Second)
fmt.Println(`m3-post-10`)
go pm.cancelWatchDir(done)
fmt.Println(`m4`)
<-terminated
fmt.Println(`m5`)
os.Exit(0) // Temporary for testing
return Err
}
func (pm *PluginManager) cancelWatchDir(done chan interface{}) {
fmt.Println(`t1`)
time.Sleep(5 * time.Second)
fmt.Println()
fmt.Println(`t2`)
close(done)
}
func (pm *PluginManager) watchDir(done <-chan interface{}, terminated chan interface{}, strings <-chan string) {
watcher, err := fsnotify.NewWatcher()
if err != nil {
Logger("watchDir::"+err.Error(), `plugins`, Error)
}
//err = watcher.Add(pm.pluginDir)
err = watcher.Add(`/srv/plugins/`)
if err != nil {
Logger("watchDir::"+err.Error(), `plugins`, Error)
}
var tmr = time.NewTimer(time.Second)
tmr.Stop()
defer close(terminated)
defer watcher.Close()
defer tmr.Stop()
for {
select {
case <-tmr.C:
fmt.Println(`UPDATE FIRED`)
tmr.Stop()
case _, ok := <-watcher.Events:
if !ok {
return
}
fmt.Println(`Ticker: STOP`)
/*
* START OF ALTERNATIVES
*
* THIS IS BY EXAMPLE AND STATED THAT IT "MUST BE" AT:
* https://golang.org/pkg/time/#Timer.Reset
*
* BUT DOESN'T WORK
*/
if !tmr.Stop() {
fmt.Println(`Ticker: CHAN DRAIN`)
<-tmr.C // STOPS HERE AND GOES NO FURTHER
}
/*
* BUT IF I JUST DO THIS IT WORKS
*/
tmr.Stop()
/*
* END OF ALTERNATIVES
*/
fmt.Println(`Ticker: RESET`)
tmr.Reset(time.Second)
case <-done:
fmt.Println(`DONE TRIGGERED`)
return
}
}
}
Besides what icza said (q.v.), note that the documentation says:
For example, assuming the program has not received from t.C already:
if !t.Stop() {
<-t.C
}
This cannot be done concurrent to other receives from the Timer's channel.
One could argue that this is not a great example since it assumes that the timer was running at the time you called t.Stop. But it does go on to mention that this is a bad idea if there's already some existing goroutine that is or may be reading from t.C.
(The Reset documentation repeats all of this, and kind of in the wrong order because Reset sorts before Stop.)
Essentially, the whole area is a bit fraught. There's no good general answer, because there are at least three possible situations during the return from t.Stop back to your call:
No one is listening to the channel, and no timer-tick is in the channel now. This is often the case if the timer was already stopped before the call to t.Stop. If the timer was already stopped, t.Stop always returns false.
No one is listening to the channel, and a timer-tick is in the channel now. This is always the case when the timer was running but t.Stop was unable to stop it from firing. In this case, t.Stop returns false. It's also the case when the timer was running but fired before you even called t.Stop, and had therefore stopped on its own, so that t.Stop was not able to stop it and returned false.
Someone else is listening to the channel.
In the last situation, you should do nothing. In the first situation, you should do nothing. In the second situation, you probably want to receive from the channel so as to clear it out. That's what their example is for.
One could argue that:
if !t.Stop() {
select {
case <-t.C:
default:
}
}
is a better example. It does one non-blocking attempt that will consume the timer-tick if present, and does nothing if there is no timer-tick. This works whether or not the timer was not actually running when you called t.Stop. Indeed, it even works if t.Stop returns true, though in that case, t.Stop stopped the timer, so the timer never managed to put a timer-tick into the channel. (Thus, if there is a datum in the channel, it must necessarily be left over from a previous failure to clear the channel. If there are no such bugs, the attempt to receive was in turn unnecessary.)
But, if someone else—some other goroutine—is or may be reading the channel, you should not do any of this at all. There is no way to know who (you or them) will get any timer tick that might be in the channel despite the call to Stop.
Meanwhile, if you're not going to use the timer any further, it's relatively harmless just to leave a timer-tick, if there is one, in the channel. It will be garbage-collected when the channel itself is garbage-collected. Of course, whether this is sensible depends on what you are doing with the timer, but in these cases it suffices to just call t.Stop and ignore its return value.
You create a timer and you stop it immediately:
var tmr = time.NewTimer(time.Second)
tmr.Stop()
This doesn't make any sense, I assume this is just an "accident" from your part.
But going further, inside your loop:
case _, ok := <-watcher.Events:
When this happens, you claim this doesn't work:
if !tmr.Stop() {
fmt.Println(`Ticker: CHAN DRAIN`)
<-tmr.C // STOPS HERE AND GOES NO FURTHER
}
Timer.Stop() documents that it returns true if this call stops the timer, and false if the timer has already been stopped (or expired). But your timer was already stopped, right after its creation, so tmr.Stop() returns false properly, so you go inside the if and try to receive from tmr.C, but since the timer was "long" stopped, nothing will be sent on its channel, so this is a blocking (forever) operation.
If you're the one stopping the timer explicitly with timer.Stop(), the recommended "pattern" to drain its channel doesn't make any sense and doesn't work for the 2nd Timer.Stop() call.

Computing mod inverse

I want to compute the inverse element of a prime in modular arithmetic.
In order to speed things up I start a few goroutines which try to find the element in a certain range. When the first one finds the element, it sends it to the main goroutine and at this point I want to terminate the program. So I call close in the main goroutine, but I don't know if the goroutines will finish their execution (I guess not). So a few questions arise:
1) Is this a bad style, should I have something like a WaitGroup?
2) Is there a more idiomatic way to do this computation?
package main
import "fmt"
const (
Procs = 8
P = 1000099
Base = 1<<31 - 1
)
func compute(start, end uint64, finished chan struct{}, output chan uint64) {
for i := start; i < end; i++ {
select {
case <-finished:
return
default:
break
}
if i*P%Base == 1 {
output <- i
}
}
}
func main() {
finished := make(chan struct{})
output := make(chan uint64)
for i := uint64(0); i < Procs; i++ {
start := i * (Base / Procs)
end := (i + 1) * (Base / Procs)
go compute(start, end, finished, output)
}
fmt.Println(<-output)
close(finished)
}
Is there a more idiomatic way to do this computation?
You don't actually need a loop to compute this.
If you use the GCD function (part of the standard library), you get returned numbers x and y such that:
x*P+y*Base=1
this means that x is the answer you want (because x*P = 1 modulo Base):
package main
import (
"fmt"
"math/big"
)
const (
P = 1000099
Base = 1<<31 - 1
)
func main() {
bigP := big.NewInt(P)
bigBase := big.NewInt(Base)
// Compute inverse of bigP modulo bigBase
bigGcd := big.NewInt(0)
bigX := big.NewInt(0)
bigGcd.GCD(bigX,nil,bigP,bigBase)
// x*bigP+y*bigBase=1
// => x*bigP = 1 modulo bigBase
fmt.Println(bigX)
}
Is this a bad style, should I have something like a WaitGroup?
A wait group solves a different problem.
In general, to be a responsible go citizen here and ensure your code runs and tidies up behind itself, you may need to do a combination of:
Signal to the spawned goroutines to stop their calculations when the result of the computation has been found elsewhere.
Ensure a synchronous process waits for the goroutines to stop before returning. This is not mandatory if they properly respond to the signal in #1, but if you don't wait, there will be no guarantee they have terminated before the parent goroutine continues.
In your example program, which performs this task and then quits, there is strictly no need to do either. As this comment indicates, your program's main method terminates upon a satisfactory answer being found, at which point the program will end, any goroutines will be summarily terminated, and the operating system will tidy up any consumed resources. Waiting for goroutines to stop is unnecessary.
However, if you wrapped this code up into a library or it became part of a long running "inverse prime calculation" service, it would be desirable to tidy up the goroutines you spawned to avoid wasting cycles unnecessarily. Additionally, in general, you may have other scenarios in which goroutines store state, hold handles to external resources, or hold handles to internal objects which you risk leaking if not properly tidied away – it is desirable to properly close these.
Communicating the requirement to stop working
There are several approaches to communicate this. I don't claim this is an exhaustive list! (Please do suggest other general-purpose methods in the comments or by proposing edits to the post.)
Using a special channel
Signal the child goroutines by closing a special "shutdown" channel reserved for the purpose. This exploits the channel axiom:
A receive from a closed channel returns the zero value immediately
On receiving from the shutdown channel, the goroutine should immediately arrange to tidy any local state and return from the function. Your earlier question had example code which implemented this; a version of the pattern is:
func myGoRoutine(shutdownChan <-chan struct{}) {
select {
case <-shutdownChan:
// tidy up behaviour goes here
return
// You may choose to listen on other channels here to implement
// the primary behaviour of the goroutine.
}
}
func main() {
shutdownChan := make(chan struct{})
go myGoRoutine(shutdownChan)
// some time later
close(shutdownChan)
}
In this instance, the shutdown logic is wasted because the main() method will immediately return after the call to close. This will race with the shutdown of the goroutine, but we should assume it will not properly execute its tidy-up behaviour. Point 2 addresses ways to fix this.
Using a context
The context package provides the option to create a context which can be cancelled. On cancellation, a channel exposed by the context's Done() method will be closed, which signals time to return from the goroutine.
This approach is approximately the same as the previous method, with the exception of neater encapsulation and the availability of a context to pass to downstream calls in your goroutine to cancel nested calls where desired. Example:
func myGoRoutine(ctx context.Context) {
select {
case <-ctx.Done():
// tidy up behaviour goes here
return
// Put real behaviour for the goroutine here.
}
}
func main() {
// Get a context (or use an existing one if you are provided with one
// outside a `main` method:
ctx := context.Background()
// Create a derived context with a cancellation method
ctx, cancel := context.WithCancel(ctx)
go myGoRoutine(ctx)
// Later, when ready to quit
cancel()
}
This has the same bug as the other case in that the main method will not wait for the child goroutines to quit before returning.
Waiting (or "join"ing) for child goroutines to stop
The code which closes the shutdown channel or closes the context in the above examples will not wait for child goroutines to stop working before continuing. This may be acceptable in some instances, while in others you may require the guarantee that goroutines have stopped before continuing.
sync.WaitGroup can be used to implement this requirement. The documentation is comprehensive. A wait group is a counter which should be incremented using its Add method on starting a goroutine and decremented using its Done method when a goroutine completes. Code can wait for the counter to return to zero by calling its Wait method, which blocks until the condition is true. All calls to Add must occur before a call to Wait.
Example code:
func main() {
var wg sync.WaitGroup
// Increment the WaitGroup with the number of goroutines we're
// spawning.
wg.Add(1)
// It is common to wrap a goroutine in a function which performs
// the decrement on the WaitGroup once the called function returns
// to avoid passing references of this control logic to the
// downstream consumer.
go func() {
// TODO: implement a method to communicate shutdown.
callMyFunction()
wg.Done()
}()
// Indicate shutdown, e.g. by closing a channel or cancelling a
// context.
// Wait for goroutines to stop
wg.Wait()
}
Is there a more idiomatic way to do this computation?
This algorithm is certainly parallelizable through use of goroutines in the manner you have defined. As the work is CPU-bound, the limitation of goroutines to the number of available CPUs makes sense (in the absence of other work on the machine) to benefit from the available compute resource.
See peterSO's answer for a bug fix.

Goroutine only works when fmt.Println is executed

For some reason, when I remove the fmt.Printlns then the code is blocking.
I've got no idea why it happens. All I want to do is to implement a simple concurrency limiter...
I've never experienced such a weird thing. It's like that fmt flushes the variables or something and makes it work.
Also, when I use a regular function instead of a goroutine then it works too.
Here's the following code -
package main
import "fmt"
type ConcurrencyLimit struct {
active int
Limit int
}
func (c *ConcurrencyLimit) Block() {
for {
fmt.Println(c.active, c.Limit)
// If should block
if c.active == c.Limit {
continue
}
c.active++
break
}
}
func (c *ConcurrencyLimit) Decrease() int {
fmt.Println("decrease")
if c.active > 0 {
c.active--
}
return c.active
}
func main() {
c := ConcurrencyLimit{Limit: 1}
c.Block()
go func() {
c.Decrease()
}()
c.Block()
}
Clarification: Even though I've accepted #kaedys 's answer(here) a solution was answered by #Kaveh Shahbazian (here)
You're not giving c.Decrease() a chance to run. c.Block() runs an infinite for loop, but it never blocks in that for loop, just calling continue over and over on every iteration. The main thread spins at 100% usage endlessly.
However, when you add an fmt.Print() call, that makes a syscall, which allows the other goroutine to run.
This post has details on how exactly goroutines yield or are pre-empted. Note, however, that it's slightly out of date, as entering a function now has a random chance to yield that thread to another goroutine, to prevent similar style flooding of threads.
As others have pointed out, Block() will never yield; a goroutine is not a thread. You could use Gosched() in the runtime package to force a yield -- but note that spinning this way in Block() is a pretty terrible idea.
There are much better ways to do concurrency limiting. See http://jmoiron.net/blog/limiting-concurrency-in-go/ for one example
What you are looking for is called a semaphore. You can apply this pattern using channels
http://www.golangpatterns.info/concurrency/semaphores
The idea is that you create a buffered channel of a desired length. Then you make callers acquire the resource by putting a value into the channel and reading it back out when they want to free the resource. Doing so creates proper synchronization points in your program so that the Go scheduler runs correctly.
What you are doing now is spinning the cpu and blocking the Go scheduler. It depends on how many cpus you have available, the version of Go, and the value of GOMAXPROCS. Given the right combination, there may not be another available thread to service other goroutines while you infinitely spin that particular thread.
While other answers pretty much covered the reason (not giving a chance for the goroutine to run) - and I'm not sure what you intend to achieve here - you are mutating a value concurrently without proper synchronization. A rewrite of above code with synchronization considered; would be:
type ConcurrencyLimit struct {
active int
Limit int
cond *sync.Cond
}
func (c *ConcurrencyLimit) Block() {
c.cond.L.Lock()
for c.active == c.Limit {
c.cond.Wait()
}
c.active++
c.cond.L.Unlock()
c.cond.Signal()
}
func (c *ConcurrencyLimit) Decrease() int {
defer c.cond.Signal()
c.cond.L.Lock()
defer c.cond.L.Unlock()
fmt.Println("decrease")
if c.active > 0 {
c.active--
}
return c.active
}
func main() {
c := ConcurrencyLimit{
Limit: 1,
cond: &sync.Cond{L: &sync.Mutex{}},
}
c.Block()
go func() {
c.Decrease()
}()
c.Block()
fmt.Println(c.active, c.Limit)
}
sync.Cond is a synchronization utility designed for times that you want to check if a condition is met, concurrently; while other workers are mutating the data of the condition.
The Lock and Unlock functions work as we expect from a lock. When we are done with checking or mutating, we can call Signal to awake one goroutine (or call Broadcast to awake more than one), so the goroutine knows that is free to act upon the data (or check a condition).
The only part that may seem unusual is the Wait function. It is actually very simple. It is like calling Unlock and instantly call Lock again - with the exception that Wait would not try to lock again, unless triggered by Signal (or Broadcast) in other goroutines; like the workers that are mutating the data (of the condition).

Golang: goroutine infinite-loop

When an fmt.Print() line is removed from the code below, code runs infinitely. Why?
package main
import "fmt"
import "time"
import "sync/atomic"
func main() {
var ops uint64 = 0
for i := 0; i < 50; i++ {
go func() {
for {
atomic.AddUint64(&ops, 1)
fmt.Print()
}
}()
}
time.Sleep(time.Second)
opsFinal := atomic.LoadUint64(&ops)
fmt.Println("ops:", opsFinal)
}
The Go By Example article includes:
// Allow other goroutines to proceed.
runtime.Gosched()
The fmt.Print() plays a similar role, and allows the main() to have a chance to proceed.
A export GOMAXPROCS=2 might help the program to finish even in the case of an infinite loop, as explained in "golang: goroute with select doesn't stop unless I added a fmt.Print()".
fmt.Print() explicitly passes control to some syscall stuff
Yes, go1.2+ has pre-emption in the scheduler
In prior releases, a goroutine that was looping forever could starve out other goroutines on the same thread, a serious problem when GOMAXPROCS provided only one user thread.
In Go 1.2, this is partially addressed: The scheduler is invoked occasionally upon entry to a function. This means that any loop that includes a (non-inlined) function call can be pre-empted, allowing other goroutines to run on the same thread.
Notice the emphasis (that I put): it is possible that in your example the for loop atomic.AddUint64(&ops, 1) is inlined. No pre-emption there.
Update 2017: Go 1.10 will get rid of GOMAXPROCS.

Resources