Goroutine schedule algorithm - go

package main

func main() {
    msgQueue := make(chan int, 1000000)
    netAddr := "127.0.0.1"
    token := make(chan int, 10)
    for i := 0; i < 10; i++ {
        token <- i
    }
    go RecvReq(netAddr, msgQueue)
    for {
        select {
        case req := <-msgQueue:
            go HandleReq(req, token)
        }
    }
}

func RecvReq(addr string, msgQueue chan int) {
    msgQueue <- 0 // placeholder: get a request from the network
}

func HandleReq(msg int, token chan int) {
    // step 1
    t := <-token
    // step 2
    // code here... (don't call runtime.park)
    // step 3
    // code here... (may call runtime.park)
    // step 4
    token <- t
}
System: 1 CPU, 2 cores
Go version: go1.3 linux/amd64
Problem description:
msgQueue receives requests all the time via RecvReq, so the main goroutine keeps creating new goroutines, but the waiting goroutines wait forever. The first 10 goroutines stop at step 3; the new goroutines that follow stop at step 1.
Q1: How can I make the waiting goroutines run while new goroutines are being created all the time?
Q2: How do I balance RecvReq and HandleReq? The receive rate is 10 times faster than the handle rate.

Alas this is not very clear from your question. But there are several issues here.
You create a buffered channel of size n then insert n items into it. Don't do this - or to be clearer, don't do this until you know it's needed. Buffered channels usually fall into the 'premature optimisation' category. Start with unbuffered channels so you can work out how the goroutines co-operate. When it's working (free of deadlocks), measure the performance, add buffering, try again.
Your select has only one case, so it behaves exactly as if the select weren't there and the case body were the only code.
You are trying to spawn off new goroutines for every message. Is this really what you wanted? You may find you can use a static mesh of goroutines, perhaps 10 in your case, and the result may be a program in which the intent is clearer. It would also give a small saving because the runtime would not have to spawn and clean up goroutines dynamically (however, you should be concerned with correct behaviour first, before worrying about any inefficiencies).
Your RecvReq is missing from the playground example, which is not executable.
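To make the static-mesh suggestion concrete, here is a minimal sketch (not the original poster's code; the network read is replaced by a stub and handleReq is a placeholder) of a fixed pool of 10 handler goroutines draining the message channel:

package main

import (
    "fmt"
    "sync"
)

// handleReq stands in for the real per-message work (HandleReq in the question).
func handleReq(id, msg int) {
    fmt.Printf("worker %d handled %d\n", id, msg)
}

func main() {
    msgQueue := make(chan int) // start unbuffered; add buffering only after measuring

    var wg sync.WaitGroup
    // A fixed pool of 10 workers replaces the per-message `go HandleReq(...)`;
    // the pool size plays the role of the token channel as the concurrency limit.
    for w := 0; w < 10; w++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for msg := range msgQueue {
                handleReq(id, msg)
            }
        }(w)
    }

    // Stub for RecvReq: in the real program this loop would read from the network.
    for i := 0; i < 100; i++ {
        msgQueue <- i
    }
    close(msgQueue) // lets the workers' range loops end
    wg.Wait()
}

Because msgQueue starts unbuffered, the receiving side blocks whenever all 10 workers are busy, which is one simple way to keep the receive rate and the handle rate in balance (Q2); add buffering later only if measurement shows it helps.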

Related

Recursive calls from function started as goroutine & Idiomatic way to continue caller when all worker goroutines finished

I am implementing a (sort of) combinatorial backtracking algorithm in Go utilising goroutines. My problem can be represented as a tree with a certain degree/spread where I want to visit each leaf and calculate a result depending on the path taken. On a given level, I want to spawn goroutines to process the subproblems concurrently, i.e. if I have a tree with degree 3 and I want to start the concurrency after level 2, I'd spawn 3*3=9 goroutines that proceed with processing the subproblems concurrently.
func main() {
    cRes := make(chan string, 100)
    res := []string{}
    numLevels := 5
    spread := 3
    startConcurrencyAtLevel := 2
    nTree("", numLevels, spread, startConcurrencyAtLevel, cRes)
    for {
        select {
        case r := <-cRes:
            res = append(res, r)
        case <-time.After(10 * time.Second):
            fmt.Println("Calculation timed out")
            fmt.Println(len(res), math.Pow(float64(spread), float64(numLevels)))
            return
        }
    }
}
func nTree(path string, maxLevels int, spread int, startConcurrencyAtLevel int, cRes chan string) {
    if len(path) == maxLevels {
        // some longer running task here associated with the found path, also using a lookup table
        // real problem actually returns not the path but the result if it satisfies some condition
        cRes <- path
        return
    }
    for i := 1; i <= spread; i++ {
        nextPath := path + fmt.Sprint(i)
        if len(path) == startConcurrencyAtLevel {
            go nTree(nextPath, maxLevels, spread, startConcurrencyAtLevel, cRes)
        } else {
            nTree(nextPath, maxLevels, spread, startConcurrencyAtLevel, cRes)
        }
    }
}
The above code works; however, I rely on the for/select statement timing out. I am looking for a way to continue with main() as soon as all goroutines have finished, i.e. all subproblems have been processed.
I already came up with two possible (though inelegant) solutions:
Using a mutex-protected result map plus a waitgroup instead of a channel-based approach should do the trick, but I'm curious if there is a neat solution with channels.
Using a quit channel (of type int). Every time a goroutine is spawned, the quit channel gets a +1; every time a computation finishes in a leaf, it gets a -1, and the caller sums up the values. See the following snippet; this, however, is not a good solution, as it (rather blatantly) runs into timing issues I don't want to deal with. It quits prematurely if, for instance, the first goroutine finishes before another one has been spawned.
for {
    select {
    case q := <-cRunningRoutines:
        runningRoutines += q
        if runningRoutines == 0 {
            fmt.Println("Calculation complete")
            return res
        }
        // ...same cases as above
    }
}
Playground: https://go.dev/play/p/9jzeCvl8Clj
Following questions:
Is doing recursive calls from a function started as a goroutine to itself a valid approach?
What would be an idiomatic way of reading the results from cRes until all spawned goroutines finish? I read somewhere that channels should be closed when computation is done, but I just can't wrap my head around how to integrate it in this case.
Happy about any ideas, thanks!
Reading the description and the snippet, I am not able to understand exactly what you are trying to achieve, but I have some hints and patterns for channels that I use daily and think are helpful.
The context package is very helpful for managing goroutines' state in a safe way. In your example, time.After is used to end the main program, but in non-main functions it could be leaking goroutines: if instead you use a context.Context and pass it into the goroutines (it's usually passed as the first argument of a function), you will be able to control cancellation of downstream calls. This explains it briefly.
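As a minimal sketch of that hint (the names worker and out are illustrative, not from the question), a goroutine that takes a context.Context as its first argument can stop cleanly when the caller cancels or the timeout expires:

package main

import (
    "context"
    "fmt"
    "time"
)

// worker emits results until the caller's context is cancelled.
func worker(ctx context.Context, out chan<- string) {
    for i := 0; ; i++ {
        select {
        case <-ctx.Done():
            return // cancelled or timed out: stop instead of leaking
        case out <- fmt.Sprintf("result %d", i):
            time.Sleep(100 * time.Millisecond)
        }
    }
}

func main() {
    // The whole run is bounded by a one-second deadline.
    ctx, cancel := context.WithTimeout(context.Background(), time.Second)
    defer cancel()

    out := make(chan string)
    go worker(ctx, out)

    for {
        select {
        case r := <-out:
            fmt.Println(r)
        case <-ctx.Done():
            fmt.Println("done:", ctx.Err())
            return
        }
    }
}

When the deadline passes, ctx.Done() is closed, so both the worker and the main loop return instead of leaking.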
It is common practice to create channels (and return them) in the functions that produce messages and send them into the channel. The same function should be responsible for closing the channel, e.g. with defer close(channel) when it's done writing.
This is handy because a buffered channel can be closed even while it still has data in it: receivers can keep draining the buffered values after the close. For unbuffered channels, the function won't be able to send a message over the channel until a reader of the channel is ready to receive it, and thus won't be able to exit until then.
This is an example (without recursion).
We can close the channel whether it is buffered or unbuffered in this example, because the send blocks until the for range over the channel in the main goroutine reads from it.
This is a variant for the same principle, with the channel passed as argument.
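Since the linked examples aren't reproduced here, a minimal sketch of the pattern (with illustrative names) might look like this: the producer owns the channel, returns it, and closes it when done, while the caller just ranges over it.

package main

import "fmt"

// produce creates the channel, sends into it, and closes it when it is done writing.
func produce(n int) <-chan string {
    out := make(chan string) // works the same whether buffered or unbuffered
    go func() {
        defer close(out) // the writer is responsible for closing
        for i := 0; i < n; i++ {
            out <- fmt.Sprintf("msg %d", i)
        }
    }()
    return out
}

func main() {
    // range stops automatically when produce closes the channel
    for msg := range produce(5) {
        fmt.Println(msg)
    }
}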
We can use sync.WaitGroup in tandem with channels to signal completion of individual goroutines and let an "orchestrating" goroutine know that the channel can be closed, because all message producers are done sending data into it. The same considerations as above apply to the close operation.
This is an example showing the use of waitGroup and external closer of channel.
Channels can have a direction! Notice that in the example I added/removed arrows next to the channel (e.g. <-chan string, or chan<- string) when passing it in/out of functions. This tells the compiler that a channel is read-only or write-only, respectively, in the scope of that function.
This helps in two ways:
the compiler enforces the direction, so misuse (e.g. sending on a receive-only channel) is caught at compile time.
the signature of the function documents whether it only writes to the channel (and possibly close()s it) or only reads from it: remember that reading from a channel with range automatically stops the iteration when the channel is closed.
You can build channels of channels: make(chan chan string) is a valid (and helpful) construct for building processing pipelines.
A common use of it is a fan-in goroutine that collects the outputs of a series of channel-producing goroutines.
This is an example of how to use them.
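Again as an illustrative sketch (not the linked example): producers hand their output channels to a fan-in goroutine over a chan chan string, and the fan-in merges everything into a single stream.

package main

import (
    "fmt"
    "sync"
)

// fanIn merges every channel received on streams into a single output channel.
func fanIn(streams chan chan string) <-chan string {
    out := make(chan string)
    var wg sync.WaitGroup
    go func() {
        for s := range streams {
            wg.Add(1)
            go func(s chan string) {
                defer wg.Done()
                for v := range s {
                    out <- v
                }
            }(s)
        }
        wg.Wait()  // all producers have closed their channels
        close(out) // so the merged stream can be closed too
    }()
    return out
}

// producer returns a channel that yields a few labelled values, then closes it.
func producer(label string) chan string {
    c := make(chan string)
    go func() {
        defer close(c)
        for i := 0; i < 3; i++ {
            c <- fmt.Sprintf("%s-%d", label, i)
        }
    }()
    return c
}

func main() {
    streams := make(chan chan string)
    go func() {
        defer close(streams)
        streams <- producer("a")
        streams <- producer("b")
    }()
    for v := range fanIn(streams) {
        fmt.Println(v)
    }
}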
In essence, to answer your initial questions:
Is doing recursive calls from a function started as a goroutine to itself a valid approach?
If you really need recursion, it's probably better to handle it separately from the concurrent code: create a dedicated function that recursively sends data into a channel, and orchestrate the closing of the channel in the caller.
What would be an idiomatic way of reading the results from cRes until all spawned goroutines finish? I read somewhere that channels should be closed when computation is done, but I just can't wrap my head around how to integrate it in this case.
A good reference is Go Concurrency Patterns: Pipelines and cancellation: this is a rather old post (from before the context package existed in the standard library), and I think Parallel digestion is what you're looking for to address the original question.
As mentioned by torek, I spun off an anonymous function that closes the channel after the waitgroup finishes waiting. I also needed some logic so that the wg.Done() of a spawned goroutine is only called after the recursion started at the goroutine-spawning level returns.
Generally I think this is a useful idiom (correct me if I'm wrong :))
Playground: https://go.dev/play/p/bQjHENsZL25
func main() {
    cRes := make(chan string, 100)
    numLevels := 3
    spread := 3
    startConcurrencyAtLevel := 2

    var wg sync.WaitGroup
    nTree("", numLevels, spread, startConcurrencyAtLevel, cRes, &wg)

    go func() {
        // time.Sleep(1 * time.Second) // edit: code should work without this initial sleep
        wg.Wait()
        close(cRes)
    }()

    for r := range cRes {
        fmt.Println(r)
    }
    fmt.Println("Done!")
}
func nTree(path string, maxLevels int, spread int, startConcurrencyAtLevel int, cRes chan string, wg *sync.WaitGroup) {
    if len(path) == maxLevels {
        // some longer running task here associated with the found path
        cRes <- path
        return
    }
    for i := 1; i <= spread; i++ {
        nextPath := path + fmt.Sprint(i)
        if len(path) == startConcurrencyAtLevel {
            wg.Add(1)
            go func(p string) {
                // Done is only called once the whole recursion spawned at this level returns.
                defer wg.Done()
                nTree(p, maxLevels, spread, startConcurrencyAtLevel, cRes, wg)
            }(nextPath)
        } else {
            nTree(nextPath, maxLevels, spread, startConcurrencyAtLevel, cRes, wg)
        }
    }
}

Golang intermittent behaviour on timed out Goroutine

I am trying to implement concurrency for a repetitive task. I want to run an HTTP request on a different goroutine (represented by the longRunningTask function). I provide a timer as a mechanism to stop the goroutine and send a timeout signal to the main goroutine if the heavy-load task exceeds the predefined timeout. The problem I currently have is that I am getting intermittent behaviour.
The code has been simplified to look like below.
package main

import (
    "fmt"
    "time"
)

func main() {
    var iteration int = 5
    timeOutChan := make(chan struct{})
    resultChan := make(chan string)

    for i := 0; i < iteration; i++ {
        go longRunningTask(timeOutChan, resultChan)
    }

    for i := 0; i < iteration; i++ {
        select {
        case data := <-resultChan:
            fmt.Println(data)
        case <-timeOutChan:
            fmt.Println("timed out")
        }
    }
}

func longRunningTask(tc chan struct{}, rc chan string) {
    timer := time.NewTimer(time.Nanosecond * 1)
    defer timer.Stop()

    // Heavy load task
    time.Sleep(time.Second * 1)

    select {
    case <-timer.C:
        tc <- struct{}{}
    case rc <- "success":
        return
    }
}
I believe every try should print out
timeout
timeout
timeout
timeout
timeout
Instead, I intermittently get
success
timeout
timeout
timeout
timeout
The doc mentions:
NewTimer creates a new Timer that will send the current time on its
channel after at least duration d.
"At least" means the timer is guaranteed to take the specified time, which implicitly means it can also take more time than specified. The timer fires independently of your goroutine and writes to its channel on expiry.
Because of the scheduler, garbage collection, or the act of writing to the other channel, things can get delayed. Besides, the simulated workload is very short considering the possibilities above.
Update:
Update:
As Peter mentioned in a comment, writing "success" to the rc channel is an action that is equally likely to complete, because the value can be read from the other end by the main goroutine. The select has to choose between (1) writing "success" to the rc channel and (2) the expired timer, and both are possible.
The likelihood of (1) is higher at the beginning, because the main goroutine has not yet read from the other end. Once that happens, the remaining goroutines have to compete to write "success" to the channel (which is blocking, with a buffer size of 0), so from then on the expired-timer case is more likely to be selected, since there is no telling how quickly the main goroutine will read from resultChan (the other end of rc).
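In other words, once the work is done both select cases are often ready at the same time, and Go's select chooses pseudo-randomly among ready cases. If the timeout should always win once it has expired, one possible fix (a sketch, not the only way, reusing the imports and main from the question above) is to check the timer with a non-blocking select before attempting the send:

func longRunningTask(tc chan struct{}, rc chan string) {
    timer := time.NewTimer(time.Nanosecond * 1)
    defer timer.Stop()

    // Heavy load task
    time.Sleep(time.Second * 1)

    // Give the expired timer priority: only try to report success
    // if the timer has not fired yet.
    select {
    case <-timer.C:
        tc <- struct{}{}
        return
    default:
    }
    rc <- "success"
}

With the rest of the program unchanged, this prints "timed out" five times, because after the one-second sleep the 1 ns timer has always fired.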

Spread sequential tests into 4 goroutines and terminate all if one fails

Suppose I have a simple loop which does sequential tests like this.
for f := 1; f <= 1000; f++ {
    if doTest(f) {
        break
    }
}
I loop through a range of numbers and do a test for each number. If the test fails for one number, I break and exit the main thread. Simple enough.
Now, how do I correctly feed the test numbers to, say, four or several goroutines? Basically, I want to test the numbers from 1 to 1000 in batches of 4 (or whatever the number of goroutines is).
Do I create 4 goroutines reading from one channel and feed the numbers sequentially into this channel? Or do I give each of the 4 goroutines its own channel?
And another question: how do I stop all 4 goroutines if one of them fails the test? I've been reading some texts on channels but I cannot put the pieces together.
You can create a producer/consumer system: https://play.golang.org/p/rks0gB3aDb
func main() {
    ch := make(chan int)
    clients := 4
    // make it buffered, so all clients can fail without hanging
    notifyCh := make(chan struct{}, clients)

    go produce(100, ch, notifyCh)

    var wg sync.WaitGroup
    wg.Add(clients)
    for i := 0; i < clients; i++ {
        go func() {
            consumer(ch, notifyCh)
            wg.Done()
        }()
    }
    wg.Wait()
}

func consumer(in chan int, notifyCh chan struct{}) {
    fmt.Printf("Start consumer\n")
    for i := range in {
        <-time.After(100 * time.Millisecond)
        if i == 42 {
            fmt.Printf("%d fails\n", i)
            notifyCh <- struct{}{}
            return
        } else {
            fmt.Printf("%d\n", i)
        }
    }
    fmt.Printf("Consumer stopped working\n")
}

func produce(N int, out chan int, notifyCh chan struct{}) {
    for i := 0; i < N; i++ {
        select {
        case out <- i:
        case <-notifyCh:
            close(out)
            return
        }
    }
    close(out)
}
The producer pushes numbers from 0 to 99 into the channel, and the consumers consume until the channel is closed. In main we create 4 clients and add them to a waitgroup to reliably check that every goroutine has returned.
Any consumer can signal on notifyCh; the producer then stops working and no further numbers are generated, so all consumers return after finishing their current number.
There's also the option of creating 4 goroutines, waiting for all of them to return, then starting the next 4 goroutines, but this adds quite a lot of waiting overhead.
Since you mentioned prime numbers, here's a really cool prime sieve: https://golang.org/doc/play/sieve.go
Whether you create one shared channel or a channel per goroutine depends on what you want.
If you only want to put numbers (or, more generally, requests) in and you don't care which goroutine serves each one, then it is better to share a channel. If, for example, you want the first 250 requests to be served by goroutine 1, then you cannot share a channel.
It is good practice to use a channel as either input or output. And the simplest way for a sender to signal that it is finished is to close the channel. A good article about that is https://blog.golang.org/pipelines
What is not mentioned in the question is that you also need another channel (or channels), or some other communication primitive, to get the results back. And this channel is more interesting than the feeding one.
What information should be sent on it - a bool after every doTest, or just a notification when everything is done (in which case not even a bool is needed; just close the channel)?
If you prefer the program to fail fast, then I would use a buffered, shared channel to feed the numbers. Don't forget to close it when all the numbers have been fed.
And use another, unbuffered channel to let the main thread know that the tests are done. It can be a channel on which you only put the number where the test failed, or, if you also want positive results, a channel of structs containing the number and the result, or any other information returned from doTest.
Another very good article about channels is http://dave.cheney.net/2014/03/19/channel-axioms
Each of your four goroutines can report a failure (by sending the error and closing the channel). But the gotcha is what the goroutines should do when all numbers have passed and the feeding channel is closed. A nice article about that is http://nathanleclaire.com/blog/2014/02/15/how-to-wait-for-all-goroutines-to-finish-executing-before-continuing/
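As a sketch of this second suggestion (doTest here is a stand-in that "fails" at 42; all names are illustrative): a buffered, shared feed channel that the producer closes, plus a result channel that carries the number at which a test failed.

package main

import (
    "fmt"
    "sync"
)

// doTest is a stand-in for the real test; here it "fails" at 42.
func doTest(n int) bool { return n == 42 }

func main() {
    const workers = 4

    feed := make(chan int, 100)       // buffered, shared feed channel
    failed := make(chan int, workers) // buffered so workers never block when reporting

    // Feed the numbers and close the channel when all of them have been sent.
    go func() {
        defer close(feed)
        for n := 1; n <= 1000; n++ {
            feed <- n
        }
    }()

    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for n := range feed {
                if doTest(n) {
                    failed <- n // report which number failed
                    return
                }
            }
        }()
    }

    // Close failed once all workers have returned, so the receive below can finish.
    go func() {
        wg.Wait()
        close(failed)
    }()

    if n, ok := <-failed; ok {
        fmt.Println("test failed at", n)
    } else {
        fmt.Println("all tests passed")
    }
}

Stopping the producer early on the first failure can be layered on top with a notify channel exactly as in the previous answer; here the remaining workers simply drain whatever numbers are left.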

Need help to understand this weird behaviour of goroutines

I have the following code using goroutines:
package main

import (
    "fmt"
    "time"
)

func thread_1(i int) {
    time.Sleep(time.Second * 1)
    fmt.Println("thread_1: i: ", i)
}

func thread_2(i int) {
    time.Sleep(time.Second * 1)
    fmt.Println("thread_2: i: ", i)
}

func main() {
    for i := 0; i < 100; i++ {
        go thread_1(i)
        go thread_2(i)
    }
    var input string
    fmt.Scanln(&input)
}
I was expecting each goroutine to wait for a second and then print its value of i.
However, all the goroutines (for both thread_1 and thread_2) waited for 1 second and then printed all the values of i at once.
Another question in the same context:
I have a server-client application where the server spawns a receiver goroutine, one per client connection, to receive data from the client. Each receiver goroutine then spawns a worker goroutine, called a processor, to process the data. There could be multiple receiver goroutines and hence multiple processor goroutines.
In such a case, some receiver and processor goroutines may hog resources indefinitely.
Please help me understand this behaviour.
You spawn 100 goroutines running thread_1 and 100 goroutines running thread_2 in one big batch. Each of these 200 goroutines sleeps for one second, then prints and ends. So yes, the behaviour is to be expected: 200 goroutines each sleeping for 1 second in parallel, and then 200 goroutines printing in parallel.
(And I do not understand your second question)

How does make(chan bool) behave differently from make(chan bool, 1)?

My question arises from trying to read from a channel, if I can, or write to it, if I can, using a select statement.
I know that channels created like make(chan bool, 1) are buffered, and part of my question is what the difference is between that and make(chan bool) -- which this page says is the same thing as make(chan bool, 0). What is the point of a channel that can fit 0 values in it?
See playground A:
chanFoo := make(chan bool)
for i := 0; i < 5; i++ {
    select {
    case <-chanFoo:
        fmt.Println("Read")
    case chanFoo <- true:
        fmt.Println("Write")
    default:
        fmt.Println("Neither")
    }
}
A output:
Neither
Neither
Neither
Neither
Neither
(Removing the default case results in a deadlock!!)
Now see playground B:
chanFoo := make(chan bool, 1) // the only difference is the buffer size of 1
for i := 0; i < 5; i++ {
    select {
    case <-chanFoo:
        fmt.Println("Read")
    case chanFoo <- true:
        fmt.Println("Write")
    default:
        fmt.Println("Neither")
    }
}
B output:
Write
Read
Write
Read
Write
In my case, B output is what I want. What good are unbuffered channels? All the examples I see on golang.org appear to use them to send one signal/value at a time (which is all I need) -- but as in playground A, the channel never gets read or written. What am I missing here in my understanding of channels?
what is the point of a channel that can fit 0 values in it
First I want to point out that the second parameter here means the buffer size, so make(chan bool) is simply a channel without a buffer (an unbuffered channel).
Actually, that's the reason for the behaviour you see. Unbuffered channels are only writable when someone is blocked reading from them, which means you need other goroutines to work with, instead of this single one.
Also see The Go Memory Model:
A receive from an unbuffered channel happens before the send on that channel completes.
An unbuffered channel means that reads from and writes to the channel are blocking.
In a select statement:
the read would work if some other goroutine was currently blocked writing to the channel
the write would work if some other goroutine was currently blocked reading from the channel
otherwise the default case is executed, which happens in your case A.
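A minimal illustration of the first point (a sketch using nothing beyond the standard library): the read case is chosen because another goroutine is already blocked trying to send on the unbuffered channel.

package main

import (
    "fmt"
    "time"
)

func main() {
    chanFoo := make(chan bool) // unbuffered

    // This goroutine blocks on its send until someone is ready to receive.
    go func() {
        chanFoo <- true
    }()

    time.Sleep(100 * time.Millisecond) // only to make it likely the sender is already blocked

    select {
    case v := <-chanFoo:
        fmt.Println("Read", v) // chosen: a sender is blocked on the other side
    case chanFoo <- true:
        fmt.Println("Write") // would need a blocked receiver instead
    default:
        fmt.Println("Neither")
    }
}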
Unbuffered channels (created without capacity) will block the sender until somebody can read from them, so to make it work as you expect, you should use two goroutines to avoid the deadlock in the same thread.
For example, with this code: http://play.golang.org/p/KWJ1gbdSqf
It also includes a mechanism for the main func to detect when both goroutines have finished.
package main

import "fmt"

func send(out, finish chan bool) {
    for i := 0; i < 5; i++ {
        out <- true
        fmt.Println("Write")
    }
    finish <- true
    close(out)
}

func recv(in, finish chan bool) {
    for _ = range in {
        fmt.Println("Read")
    }
    finish <- true
}

func main() {
    chanFoo := make(chan bool)
    chanfinish := make(chan bool)
    go send(chanFoo, chanfinish)
    go recv(chanFoo, chanfinish)
    <-chanfinish
    <-chanfinish
}
It won't show alternating Read and Write, because as soon as send writes to the channel it is blocked, before it can print "Write", so execution moves to recv, which receives from the channel and prints "Read". It will try to read from the channel again, but it will be blocked, and execution moves back to send. Now send can print the first "Write", send to the channel again (without blocking, because there is now a receiver ready) and print the second "Write". In any case, this is not deterministic, and the scheduler may move execution at any point to any other runnable goroutine (at least in the latest 1.2 release).

Resources