How does the select statement decide which case to run? - go

While reading the documentation on select, I ran into something I can't understand. According to the docs (and an answer on StackOverflow), select picks one case that can proceed (that is, one that won't block). If multiple cases can proceed, Go picks one of them at random.
So, as I understand it, the cases in the following code should be chosen at random:
for {
select {
case <-time.After(time.Millisecond * 101):
fmt.Println("time out1")
case <-time.After(time.Millisecond * 100):
fmt.Println("time out2")
}
time.Sleep(time.Millisecond * 50)
}
But in practice it always prints time out2.
Why does it only print time out2? I think the first case doesn't block the program either. If Go can somehow know how long the second case will take, what happens if I replace it with an operation whose duration the program cannot know in advance, such as a database operation or an HTTP request?
So does select always pick whichever case returns fastest?

time.After() returns a channel on which a value will be sent when the timeout expires.
So you have a select with 2 cases, and both receive operations block at first. The select therefore blocks until one of them can proceed.
Your 2 timeout values are different, so the channel with the smaller timeout becomes ready first, and that case is chosen as soon as its timeout expires. At that moment it is the only case that can proceed, so no random choice is involved.
When the loop body ends, the next iteration begins. You create new timeout channels, so the same thing repeats.

Related

Loop to check condition in concurrent program

I'm reading a book about concurrency in Go (I'm learning it now) and I found this code:
c := sync.NewCond(&sync.Mutex{})
queue := make([]interface{}, 0, 10)
removeFromQueue := func(delay time.Duration) {
time.Sleep(delay)
c.L.Lock()
queue = queue[1:]
fmt.Println("Removed from queue")
c.L.Unlock()
c.Signal()
}
for i := 0; i < 10; i++ {
c.L.Lock()
// Why this loop?
for len(queue) == 2 {
c.Wait()
}
fmt.Println("Adding to queue")
queue = append(queue, struct{}{})
go removeFromQueue(1*time.Second)
c.L.Unlock()
}
The problem is that I don't understand why the author introduces the for loop marked by the comment. As far as I can see, the program would be correct without it, but the author says that the loop is there because Cond will signal that something has happened only, but that doesn't mean that the state has truly changed.
In what case could that be possible?
Without the actual book at hand, and instead just some code snippets that seem out of context, it is hard to say what the author had in mind in particular. But we can guess. There is a general point about condition variables in most languages, including Go: waiting for some condition to be satisfied does require a loop in general. In some specific cases, the loop is not required.
The Go documentation is, I think, clearer about this. In particular, the text description for sync's func (c *Cond) Wait() says:
Wait atomically unlocks c.L and suspends execution of the calling goroutine. After later resuming execution, Wait locks c.L before returning. Unlike in other systems, Wait cannot return unless awoken by Broadcast or Signal.
Because c.L is not locked when Wait first resumes, the caller typically cannot assume that the condition is true when Wait returns. Instead, the caller should Wait in a loop:
c.L.Lock()
for !condition() {
c.Wait()
}
... make use of condition ...
c.L.Unlock()
The opening clause of that second paragraph, "Because c.L is not locked when Wait first resumes", is the phrase that explains the reason for the loop.
Whether you can omit the loop depends on more than one thing:
Under what condition(s) does another goroutine invoke Signal and/or Broadcast?
How many goroutines are running, and what might they be doing in parallel?
As the Go documentation says, there's one case we don't have to worry about in Go that we might in some other systems. In some systems, the equivalent of Wait can return even though Signal or Broadcast (or their equivalents) have not actually been invoked on the condition variable. This is commonly called a spurious wakeup.
The queue example you've quoted is particularly odd because there is only one goroutine—the one running the for loop that counts to ten—that can add entries to the queue. The remaining goroutines only remove entries. So if the queue length is 2, and we pause and wait for a signal that the queue length has changed, the queue length can only have changed to either one or zero: no other goroutine can add to it and only the two goroutines we have created at this point can remove from it. This means that given this particular example, we have one of those cases where the loop is not required after all.
(It's also odd in that queue is given an initial capacity of 10, which is as many items as we'll put in, and then we start waiting when its length is exactly 2, so that we should not reach that capacity anyway. If we were to spin off additional goroutines that might add to the queue, the loop that waits while len(queue) == 2 could indeed be signaled by a removal that drops the count from 2 to 1 but not get a chance to resume until insertion occurs, pushing the count back up to 2. However, depending on the situation, that loop might not be resumed until two other goroutines have each added an entry, pushing the count to 3, for instance. So why repeat the loop when the length is exactly two? If the idea is to preserve queue slots, we should loop while the count is greater than or equal to 2.)
(Besides all this, the initial capacity is not relevant, as the queue will be dynamically resized to a larger slice if necessary.)
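To make the multi-goroutine case concrete, here is a minimal sketch of a bounded queue with several producers, where the wait loop is genuinely needed. It is not the book's code: the type boundedQueue, its fields, and the function run are hypothetical names chosen for this illustration, and it waits while the length is greater than or equal to the limit, as suggested above.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedQueue is a hypothetical type used only for this illustration.
type boundedQueue struct {
	mu      sync.Mutex
	cond    *sync.Cond
	items   []int
	limit   int
	maxSeen int // largest length ever observed, for checking the invariant
}

func newBoundedQueue(limit int) *boundedQueue {
	q := &boundedQueue{limit: limit}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// put blocks while the queue is full. The condition is re-checked in a loop
// because another producer may have refilled the queue between the wakeup
// and the moment this goroutine reacquires the lock.
func (q *boundedQueue) put(v int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) >= q.limit {
		q.cond.Wait()
	}
	q.items = append(q.items, v)
	if len(q.items) > q.maxSeen {
		q.maxSeen = len(q.items)
	}
	q.cond.Broadcast()
}

// take blocks while the queue is empty, then removes the first item.
func (q *boundedQueue) take() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		q.cond.Wait()
	}
	v := q.items[0]
	q.items = q.items[1:]
	q.cond.Broadcast()
	return v
}

// run starts several producers and drains everything they send.
func run(producers, perProducer, limit int) (taken, maxSeen int) {
	q := newBoundedQueue(limit)
	var wg sync.WaitGroup
	for p := 0; p < producers; p++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perProducer; i++ {
				q.put(i)
			}
		}()
	}
	for i := 0; i < producers*perProducer; i++ {
		q.take()
		taken++
	}
	wg.Wait()
	q.mu.Lock()
	defer q.mu.Unlock()
	return taken, q.maxSeen
}

func main() {
	taken, maxSeen := run(4, 25, 2)
	fmt.Println(taken, maxSeen <= 2)
}
```

If the for in put were replaced by a single if, two producers woken by the same Broadcast could both append, and the length could exceed the limit.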

How does Go select decide to run into the default branch?

I have a program in which I have 256 goroutines generating test data and sending them to a channel.
In the consuming part of the program, I set up a select like this:
select {
case c := <-theChan:
// Do some stuff with c
default:
//
}
What surprises me is that, while the 256 goroutines keep sending items to the channel and the processing of each item takes time, the program still runs into the default branch several times.
I wonder how the select statement decides that theChan is empty and runs into default.
It depends on the scheduler. Between the moment you consume a value from the channel and the moment a sending goroutine is next given execution time by the scheduler (and can put another value into the channel), the main goroutine may have enough time to run the case body and return to the select statement. If no value has been added by then, select runs the default case.
You could reduce this by using a buffered channel.
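The rule select applies can be seen in isolation with a single non-blocking receive. This is a minimal sketch; poll is a hypothetical helper name, and a buffered channel is used so that the outcome does not depend on goroutine scheduling.

```go
package main

import "fmt"

// poll (hypothetical helper) performs one non-blocking receive:
// it takes the default branch only if no value can be received right now.
func poll(ch <-chan int) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)

	_, ok := poll(ch)
	fmt.Println(ok) // false: nothing buffered, so default runs

	ch <- 42
	v, ok := poll(ch)
	fmt.Println(v, ok) // 42 true: a buffered value is ready, so the case runs
}
```

With an unbuffered channel, the receive case is ready only if a sender is already blocked in its send at that instant, which is why a buffer makes the default branch less likely.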

Priority of case versus default in golang select statements

I have an application with multiple goroutines that are running for loops, and need a way to signal these for loops to break, and to test whether the timeout case occurred. I was looking into using a shared channel with select statements to accomplish this as follows:
// elsewhere in the code, this channel is created, and passed below
done := make(chan struct{})
time.AfterFunc(timeout, func() { close(done) })
...
go func() {
Loop:
for {
select {
case <-done:
break Loop
default:
foo()
time.Sleep(1 * time.Second)
}
}
select {
case <-done:
panic("timed out!")
default:
// ok
}
}()
Is this a valid way to accomplish this? What I'm most concerned about is that the branch of a select that is chosen could be non-deterministic, so that default may be chosen even if one of the cases is ready. Is this possible? Is there any documentation that states that a matching case is guaranteed to have preference over a default? The concern is that the for loop above could loop several times after done is closed and/or report success even though a timeout occurred.
The Go Programming Language Specification
Select statements
Execution of a "select" statement proceeds in several steps:
1. For all the cases in the statement, the channel operands of receive operations and the channel and right-hand-side expressions of send statements are evaluated exactly once, in source order, upon entering the "select" statement. The result is a set of channels to receive from or send to, and the corresponding values to send. Any side effects in that evaluation will occur irrespective of which (if any) communication operation is selected to proceed. Expressions on the left-hand side of a RecvStmt with a short variable declaration or assignment are not yet evaluated.
2. If one or more of the communications can proceed, a single one that can proceed is chosen via a uniform pseudo-random selection. Otherwise, if there is a default case, that case is chosen. If there is no default case, the "select" statement blocks until at least one of the communications can proceed.
3. Unless the selected case is the default case, the respective communication operation is executed.
4. If the selected case is a RecvStmt with a short variable declaration or an assignment, the left-hand side expressions are evaluated and the received value (or values) are assigned.
5. The statement list of the selected case is executed.
"What I'm most concerned about is that the branch of a select that is
chosen could be non-deterministic, so that default may be chosen even
if one of the cases is ready. Is this possible?"
No. See step 2 of the select specification.
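A quick way to convince yourself is to count how often default runs once done is closed. This is a minimal sketch; defaultHits is a hypothetical helper name and the iteration count is arbitrary.

```go
package main

import "fmt"

// defaultHits (hypothetical helper) closes done first, then runs the same
// select n times and counts how often the default branch is taken.
func defaultHits(n int) int {
	done := make(chan struct{})
	close(done) // a receive from a closed channel can always proceed

	hits := 0
	for i := 0; i < n; i++ {
		select {
		case <-done:
			// ready on every iteration
		default:
			hits++
		}
	}
	return hits
}

func main() {
	fmt.Println(defaultHits(100000)) // prints 0
}
```

Because the receive can proceed on every iteration, default is never a candidate. In the question's loop this means at most one more foo() call can already be in progress when done is closed; the next select sees it.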

What's the difference between a for loop with select and select alone?

I could not fully understand this from the docs or Google:
What are the differences between the two, and in which case would you use each one?
for{
select{
case s := <-something:
fmt.Println(s)
case done := <-true:
return
}
}
and
select{
case s := <-something:
fmt.Println(s)
case done := <-true:
return
}
Thanks
The code with the loop will keep printing data from the something channel until it receives anything on the done channel.
The select-only code will either print one value from the something channel or return when it receives anything on the done channel. Only one case will be executed. Keep in mind that there is no fallthrough by default in Go select and switch statements.
A select statement executes one of its cases (a send or receive on a channel) once. If none of its cases is ready, it blocks until at least one case is ready. If more than one case is ready at the same time, one of the ready cases is selected at random.
So in the second snippet, if there is some data on the something channel, it would be read and put into s. But there is also a chance that a value arrives on done first, in which case the branch case s := <-something: will never be executed.
For the first snippet (and the second as well), you probably want something like this:
for{
select{
case s := <-something:
fmt.Println(s)
case <-done: // note the difference
return
}
}
What this now does is wait for data on something while also keeping an eye on done. If there is data on the something channel (and no data on done), it will be read and put into s (the branch case s := <-something: will be executed with s holding the value read from something). This accounts for one full execution of the select statement; control then goes back to the for loop and it starts over again.
If there is no data on the something channel, select blocks and waits for data on either something or done. If data arrives on something, the same execution as above happens; if it arrives on done instead, the function returns (breaking out of the loop). This way some other goroutine can write to done and signal the function containing the for loop above to stop processing something and return.
If your program sends data to the 'something' channel a bunch of times, you are going to want to repeat the select statement until you receive a done signal.
For example, imagine you are running the following routine:
.... (some calculations)
something <- x
.... (some calculations)
something <- y
true <- z
If your routine doesn't have the for loop it will only receive the value 'x' and won't receive y or z.
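A runnable version of that scenario might look like the following. This is a sketch under assumptions: collect is a hypothetical helper name, the channels are unbuffered, and done is signalled by closing it rather than by sending a value.

```go
package main

import "fmt"

// collect (hypothetical helper) keeps receiving from something until done
// is signalled, and returns everything it received.
func collect(something <-chan string, done <-chan struct{}) []string {
	var got []string
	for {
		select {
		case s := <-something:
			got = append(got, s)
		case <-done:
			return got
		}
	}
}

func main() {
	something := make(chan string) // unbuffered: each send waits for its receiver
	done := make(chan struct{})

	go func() {
		something <- "x"
		something <- "y"
		close(done) // reached only after both values were received
	}()

	fmt.Println(collect(something, done)) // prints [x y]
}
```

Without the for loop, collect would return after the first value and the sending goroutine would block forever on its second send.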

Can data coming from different channels into select statement get ignored?

Is it possible for data coming in through a channel in Go to be ignored if it is not caught at the right moment inside a select statement?
For example, let's say there is this select statement:
for {
select {
case <-timer.C:
//block A
default:
// block B takes 2 seconds.
}
}
If timer ends while block B is running, does block A still run in the next iteration of the loop or does the channel's incoming data get lost?
When the timer expires, it sends the current time on C. The value is not lost if nothing is receiving from C at that moment: it stays available on the channel until it is received (in Go versions before 1.23, C has a one-element buffer for this purpose). In this case, it is received on the next iteration of the loop.
Channels are designed to be a synchronization mechanism, so they don't require readers and writers to be already synchronized.
