Golang - Why use done channel - go

It's one of example code of Golang. But cannot understand why 'done' channel need in this case.
https://gobyexample.com/closing-channels
There is no reason to sent true to done channel. We can know jobs channel is done when "sent all jobs" message is printed, isn't it?
I deleted code that relative to done channel and result is still same.
https://play.golang.org/p/xOmzobFpTQ

No the result is not the same:
Your main goroutine exits before your received job goroutine in many situations (e.g. different CPU loads, and it is nondeterministic and system dependent behavior), so in that way you cannot guarantee that all jobs received, e.g. just Add
time.Sleep(500)
before
fmt.Println("received job", j)
to see this, try it on The Go Playground:
// _Closing_ a channel indicates that no more values
// will be sent on it. This can be useful to communicate
// completion to the channel's receivers.
package main
import (
"fmt"
"time"
)
// In this example we'll use a `jobs` channel to
// communicate work to be done from the `main()` goroutine
// to a worker goroutine. When we have no more jobs for
// the worker we'll `close` the `jobs` channel.
func main() {
jobs := make(chan int, 5)
//done := make(chan bool)
// Here's the worker goroutine. It repeatedly receives
// from `jobs` with `j, more := <-jobs`. In this
// special 2-value form of receive, the `more` value
// will be `false` if `jobs` has been `close`d and all
// values in the channel have already been received.
// We use this to notify on `done` when we've worked
// all our jobs.
go func() {
for {
j, more := <-jobs
if more {
time.Sleep(500)
fmt.Println("received job", j)
} else {
fmt.Println("received all jobs")
//done <- true
return
}
}
}()
// This sends 3 jobs to the worker over the `jobs`
// channel, then closes it.
for j := 1; j <= 3; j++ {
jobs <- j
fmt.Println("sent job", j)
}
close(jobs)
fmt.Println("sent all jobs")
// We await the worker using the
// [synchronization](channel-synchronization) approach
// we saw earlier.
//<-done
}
output:
sent job 1
sent job 2
sent job 3
sent all jobs
instead of:
sent job 1
received job 1
received job 2
sent job 2
sent job 3
received job 3
received all jobs
sent all jobs
See:
Goroutine does not execute if time.Sleep included
Why is time.sleep required to run certain goroutines?
Weird channel behavior in go

TL;DR: There is a race condition---You got lucky.
If you didn't have the done channel, then the output of the program is non-deterministic.
Depending on the thread execution order, the main thread may exit before the goroutine finished its processing thereby causing the goroutine to be killed mid way.
By forcing the main thread to read from the done channel, we're forcing the main thread to wait until there is some data to be consumed in the done channel. This gives us a neat synchronization mechanism wherein the goroutine informs the main thread that it is done by writing to the done channel. This in turn causes the main thread's blocking <- done to finish and causes the program to terminate.

I think the accepted answer did not go into the exact reasons why.
The go language belongs to the procedural paradigm, meaning that every instruction is executed linearly. When a go routine is forked from the main go routine, it goes off on its own little adventure, leaving the main thread to return.
The buffered channel has a capacity of 5, which means it will not block until the buffer is full. It will also block if it is empty (a channel with zero capacity is inherently unbuffered).
Since there are only 4 iterations (0 to <=3), the read operation will not block.
By instructing the main thread to read from the done channel, we're forcing the main thread to wait until there is some data to be consumed in the done channel. When the iteration is over, the else branch is executed and the write operation done <- true causes the release of the <- done read operation in the main thread. The read operation waits to pull the now inserted value off of done.
After reading from done, the main Go routine is no longer blocked, hence terminating successfully.

Sent doesn't mean the job is done, when the job take long time to
finish.
The jobs channel is buffered, so even the job is sent, the job may
not even received by the worker yet.

Related

exiting multiple go routines waiting on an unbuffered channel

I was trying to exit multiple Goroutines Simultaneously.
According to https://www.godesignpatterns.com/2014/04/exiting-multiple-goroutines-simultaneously.html
there is a well defined way to do it.
Another approach i see is the following
package main
import (
"fmt"
"time"
)
func main() {
var inCh chan int = make(chan int, 100)
var exit chan bool = make(chan bool)
for i := 0; i < 20; i++ {
go func(instance int) {
fmt.Println("In go routine ", instance)
for {
select {
case <-exit:
fmt.Println("Exit received from ", instance)
exit <- true
return
case value := <-inCh:
fmt.Println("Value=", value)
}
}
}(i)
}
time.Sleep(1 * time.Second)
exit <- true
<-exit // Final exit
fmt.Println("Final exit")
}
But I am confused and i really don't get it why the unbuffered channel at the end is executed as a last statement.
In reality I have 20 go routines listening for exit channel. Randomly one will receive it and send it to another.
Why always the reception within go routines is taking place and only when all of them are finished the channel reception with comment "// Final Exit" will execute ?
I will really appreciate if someone can give me an explanation.
Use close() for cancelation as shown in the linked article.
The code in question is not guaranteed to work. Here's a scenario where it fails:
One goroutine is ready to receive from exit. All other goroutines busy somewhere else.
The value sent by main is received by the ready goroutine.
That goroutine sends a value to exit that is received by main().
The other goroutines do not exit because no more values are sent to exit. See this playground example that uses time.Seep to induce the problem scenario.
Why always the reception within go routines is taking place and only when all of them are finished the channel reception with comment "// Final Exit" will execute ?
The program executes as if the channel maintains an ordered queue of waiting goroutines, but there's nothing in the specification that guarantees that behavior. Even if the channel has an ordered queue, the program can encounter the scenario above if a goroutine is doing something other than waiting on receive from exit.
If you notice the output of you program
In go routine 6
In go routine 0
In go routine 7
.
.
Exit received from 6
Exit received from 0
Exit received from 7
.
.
Final exit
They get called in the same( or almost the same) order of how they were started. If none of your Go routines are busy the first one registered will be used. This is just an implementation of the runtime and I would not count on this behavior.
Your final exit was the last channel to be listened on so it is used last.
If you remove the time.Sleep after your loop your final exit will get called almost immediately and most of your go routines will not receive the exit signal
Output with out time.Sleep (will very between runs)
In go routine 0
Exit received from 0
In go routine 1
In go routine 2
In go routine 3
In go routine 4
In go routine 5
In go routine 6
In go routine 7
In go routine 14
In go routine 15
In go routine 16
In go routine 17
In go routine 18
In go routine 19
Final exit
Consider this slight modification.
package main
import (
"fmt"
)
func main() {
var exit chan int = make(chan int)
var workers = 20
for i := 0; i < workers; i++ {
go func(instance int) {
fmt.Println("In go routine ", instance)
for {
select {
case i := <-exit:
fmt.Println("Exit", i, "received from ", instance)
exit <- i-1
return
}
}
}(i)
}
exit <- workers
fmt.Println("Final exit:", <-exit)
}
Here, I've done 3 things: First, I removed the unused channel, for brevity. Second, I removed the sleep. third, I changed the exit channel to an int channel that is decremented by every pass. If I pass the number of workers in, any value other than 0 from the "Final" message indicates dropped workers.
Here's one example run:
% go run t.go
In go routine 8
In go routine 5
In go routine 0
In go routine 2
Exit 20 received from 8
Exit 19 received from 5
Final exit: 18
In go routine 13
When main calls time.Sleep it doesn't get scheduled until the sleep is over. The other goroutines all have this time to set up their channel readers. I can only assume, because I can't find it written anywhere, that channel readers are likely to be queued in roughly chronological error - thus, the sleep guarantees that main's reader is the last.
If this is consistent behavior, it certainly isn't reliable.
See Multiple goroutines listening on one channel for many more thoughts on this.

How can I read from a channel after my GoRoutines have finished running?

I currently have two functions pushNodes(node) and updateNodes(node). In the pushNodes function, I am pushing values through a channel that are to be used in updateNodes. In order to have accurate channel values saved, I need all pushNodes go routines to finish before starting updateNodes(). How can I still access the channel values after the GoRoutines have finished executing?
I continuously get "fatal error: all goroutines are asleep - deadlock!". Please let me know how I can get these values from the channel. Is there a better/alternate way to do this?
//pushNodes is a function that will push the nodes values
func pushNodes(node Node) {
defer wg.Done()
fmt.Printf("Pushing: %d \n", node.number)
//Choose a random peer node
var randomnode int = rand.Intn(totalnodes)
for randomnode == node.number {
rand.Seed(time.Now().UnixNano())
randomnode = rand.Intn(totalnodes)
}
//If the current node is infected, send values through the channel
if node.infected {
sentchanneldata := ChannelData{infected: true, message: node.message}
allnodes[randomnode].channel <- sentchanneldata
fmt.Printf("Node %d sent a value of %t and %s to node %d!\n", node.number, sentchanneldata.infected, sentchanneldata.message, allnodes[randomnode].number)
}
//updateNodes is a function that will update the nodes values
func updateNodes(node Node) {
defer wg.Done()
fmt.Printf("Updating: %d\n", node.number)
//get value through node channel
receivedchanneldata := <-node.channel
fmt.Printf("Node %d received a value of %t and %s!\n", node.number, receivedchanneldata.infected, receivedchanneldata.message)
// update value
if receivedchanneldata.infected == true {
node.infected = true
}
if receivedchanneldata.message != "" {
node.message = receivedchanneldata.message
}
fmt.Printf("Update successful!\n")
}
//Part of main functions
wg.Add(totalnodes)
for node := range allnodes {
go pushNodes(allnodes[node])
}
wg.Wait()
fmt.Println("Infect function done!")
wg.Add(totalnodes)
for node := range allnodes {
go updateNodes(allnodes[node])
}
wg.Wait()
How can I still access the channel values after the GoRoutines have finished executing?
A channel's existence, including any data that have been shoved into it, is independent of the goroutines that might read from or write to it, provided that at least one goroutine still exists that can read from and/or write to it. (Once all such goroutines are gone, the channel will—eventually—be GC'ed.)
Your code sample is unusable (as already noted) so we can't say precisely where you have gone wrong, but you'll get the kind of fatal message you report here:
fatal error: all goroutines are asleep - deadlock!
if you attempt to read from a channel in the last runnable goroutine, such that this goroutine goes to sleep to await a message on that channel, in such a way that the rest of the Go runtime can determine for certain that no currently-asleep goroutine will ever wake up and deliver a message on that channel. For instance, suppose you have 7 total goroutines running right as one of them reaches the following line of code:
msg = <-ch
where ch is an open channel with no data available right now. One of those 7 goroutines reaches this line and blocks ("goes to sleep"), waiting for one of the remaining six goroutines to do:
ch <- whatever
which would wake up that 7th goroutine. So now there are only 6 goroutines that can write on ch or close ch. If those six remaining goroutines also pass through the same line, one at a time or several or all at once, with none of them ever sending on the channel or closing it, those remaining goroutines will also block. When the last one of them blocks, the runtime will realize that the program is stuck, and panic.
If, however, only five of the remaining six goroutines block like this, and then the sixth one runs though a line reading:
close(ch)
that close operation will close the channel, causing all six of the "stuck asleep" goroutines to receive "end of data" represented by a zero-valued "fake" message msg. You can also use the two-valued form of receive:
msg, ok = <-ch
Here ok gets true if the channel isn't closed and msg contains a real message, but gets false if the channel is closed and msg now contains a zero-valued "fake" message.
Thus, you can either:
close the channel to indicate that you plan not to send anything else, or
carefully match up the number of "receive from channel" operations to the number of "send message on channel" operations.
The former is the norm with channels where there's no way to know in advance how many messages should be sent on the channel. It can still be used even if you do know. A typical construct for doing the close is:
ch := make(chan T) // for some type T
// do any other setup that is appropriate
var wg sync.WaitGroup
wg.add(N) // for some number N
// spin off some number of goroutines N, each of which may send
// any number of messages on the channel
for i := 0; i < N; i++ {
go doSomething(&wg, ch)
// in doSomething, call wg.Done() when done sending on ch
}
go func() {
wg.Wait() // wait for all N goroutines to finish
close(ch) // then, close the channel
}()
// Start function(s) that receive from the channel, either
// inline or in more goroutines here; have them finish when
// they see that the channel is closed.
This pattern relies on the ability to create an extra N+1'th goroutine—that's the anonymous function go func() { ... }() sequence—whose entire job in life is to wait for all the senders to say I am done sending. Each sender does that by calling wg.Done() once. That way, no sender has any special responsibility for closing the channel: they all just write and then announce "I'm done writing" when they are done writing. One goroutine has one special responsibility: it waits for all senders to have announced "I'm done writing", and then it closes the channel and exits, having finished its one job in life.
All receivers—whether that's one or many—now have an easy time knowing when nobody will ever send anything any more, because they see a closed channel at that point. So if most of the work is on the sending side, you can even use the main goroutine here with a simple for ... range ch loop.

Go project's main goroutine sleep forever?

Is there any API to let the main goroutine sleep forever?
In other words, I want my project always run except when I stop it.
"Sleeping"
You can use numerous constructs that block forever without "eating" up your CPU.
For example a select without any case (and no default):
select{}
Or receiving from a channel where nobody sends anything:
<-make(chan int)
Or receiving from a nil channel also blocks forever:
<-(chan int)(nil)
Or sending on a nil channel also blocks forever:
(chan int)(nil) <- 0
Or locking an already locked sync.Mutex:
mu := sync.Mutex{}
mu.Lock()
mu.Lock()
Quitting
If you do want to provide a way to quit, a simple channel can do it. Provide a quit channel, and receive from it. When you want to quit, close the quit channel as "a receive operation on a closed channel can always proceed immediately, yielding the element type's zero value after any previously sent values have been received".
var quit = make(chan struct{})
func main() {
// Startup code...
// Then blocking (waiting for quit signal):
<-quit
}
// And in another goroutine if you want to quit:
close(quit)
Note that issuing a close(quit) may terminate your app at any time. Quoting from Spec: Program execution:
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
When close(quit) is executed, the last statement of our main() function can proceed which means the main goroutine can return, so the program exits.
Sleeping without blocking
The above constructs block the goroutine, so if you don't have other goroutines running, that will cause a deadlock.
If you don't want to block the main goroutine but you just don't want it to end, you may use a time.Sleep() with a sufficiently large duration. The max duration value is
const maxDuration time.Duration = 1<<63 - 1
which is approximately 292 years.
time.Sleep(time.Duration(1<<63 - 1))
If you fear your app will run longer than 292 years, put the above sleep in an endless loop:
for {
time.Sleep(time.Duration(1<<63 - 1))
}
It depends on use cases to choose what kind of sleep you want.
#icza provides a good and simple solution for literally sleeping forever, but I want to give you some more sweets if you want your system could shutdown gracefully.
You could do something like this:
func mainloop() {
exitSignal := make(chan os.Signal)
signal.Notify(exitSignal, syscall.SIGINT, syscall.SIGTERM)
<-exitSignal
systemTeardown()
}
And in your main:
func main() {
systemStart()
mainloop()
}
In this way, you could not only ask your main to sleep forever, but you could do some graceful shutdown stuff after your code receives INT or TERM signal from OS, like ctrl+C or kill.
Another solution to block a goroutine. This solution prevents Go-Runtime to complain about the deadlock:
import "time"
func main() {
for {
time.Sleep(1138800 * time.Hour)
}
}

How to perform concurrent downloads in Go

We have a process whereby users request files that we need to get from our source. This source isn't the most reliable so we implemented a queue using Amazon SQS. We put the download URL into the queue and then we poll it with a small app that we wrote in Go. This app simply retrieves the messages, downloads the file and then pushes it to S3 where we store it. Once all of this is complete it calls back a service which will email the user to let them know that the file is ready.
Originally I wrote this to create n channels and then attached 1 go-routine to each and had the go-routine in an infinite loop. This way I could ensure that I was only ever processing a fixed number of downloads at a time.
I realised that this isn't the way that channels are supposed to be used and, if I'm understanding correctly now, there should actually be one channel with n go-routines receiving on that channel. Each go-routine is in an infinite loop, waiting on a message and when it receives it will process the data, do everything that it's supposed to and when it's done it will wait on the next message. This allows me to ensure that I'm only ever processing n files at a time. I think this is the right way to do it. I believe this is fan-out, right?
What I don't need to do, is to merge these processes back together. Once the download is done it is calling back a remote service so that handles the remainder of the process. There is nothing else that the app needs to do.
OK, so some code:
func main() {
queue, err := ConnectToQueue() // This works fine...
if err != nil {
log.Fatalf("Could not connect to queue: %s\n", err)
}
msgChannel := make(chan sqs.Message, 10)
for i := 0; i < MAX_CONCURRENT_ROUTINES; i++ {
go processMessage(msgChannel, queue)
}
for {
response, _ := queue.ReceiveMessage(MAX_SQS_MESSAGES)
for _, m := range response.Messages {
msgChannel <- m
}
}
}
func processMessage(ch <-chan sqs.Message, queue *sqs.Queue) {
for {
m := <-ch
// Do something with message m
// Delete message from queue when we're done
queue.DeleteMessage(&m)
}
}
Am I anywhere close here? I have n running go-routines (where MAX_CONCURRENT_ROUTINES = n) and in the loop we will keep passing messages in to the single channel. Is this the right way to do it? Do I need to close anything or can I just leave this running indefinitely?
One thing that I'm noticing is that SQS is returning messages but once I've had 10 messages passed into processMessage() (10 being the size of the channel buffer) that no further messages are actually processed.
Thanks all
That looks fine. A few notes:
You can limit the work parallelism by means other than limiting the number of worker routines you spawn. For example you can create a goroutine for every message received, and then have the spawned goroutine wait for a semaphore that limits the parallelism. Of course there are tradeoffs, but you aren't limited to just the way you've described.
sem := make(chan struct{}, n)
work := func(m sqs.Message) {
sem <- struct{}{} // When there's room we can proceed
// do the work
<-sem // Free room in the channel
}()
for _, m := range queue.ReceiveMessage(MAX_SQS_MESSAGES) {
for _, m0 := range m {
go work(m0)
}
}
The limit of only 10 messages being processed is being caused elsewhere in your stack. Possibly you're seeing a race where the first 10 fill the channel, and then the work isn't completing, or perhaps you're accidentally returning from the worker routines. If your workers are persistent per the model you've described, you'll want to be certain that they don't return.
It's not clear if you want the process to return after you've processed some number of messages. If you do want this process to exit, you'll need to wait for all the workers to finish their current tasks, and probably signal them to return afterwards. Take a look at sync.WaitGroup for synchronizing their completion, and having another channel to signal that there's no more work, or close msgChannel, and handle that in your workers. (Take a look at the 2-tuple return channel receive expression.)

Goroutine does not execute if time.Sleep included

The following code runs perfectly fine:
package main
import (
"fmt"
)
func my_func(c chan int){
fmt.Println(<-c)
}
func main(){
c := make(chan int)
go my_func(c)
c<-3
}
playgound_1
However if I change
c<-3
to
time.Sleep(time.Second)
c<-3
playground_2
My code does not execute.
My gut feeling is that somehow main returns before the my_func finishes executing, but it seems like adding a pause should not have any effect. I am totally lost on this simple example, what's going on here?
When the main function ends, the program ends with it. It does not wait for other goroutines to finish.
Quoting from the Go Language Specification: Program Execution:
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
So simply when your main function succeeds by sending the value on the channel, the program might terminate immediately, before the other goroutine has the chance to print the received value to the console.
If you want to make sure the value gets printed to the console, you have to synchronize it with the event of exiting from the main function:
Example with a "done" channel (try it on Go Playground):
func my_func(c, done chan int) {
fmt.Println(<-c)
done <- 1
}
func main() {
c := make(chan int)
done := make(chan int)
go my_func(c, done)
time.Sleep(time.Second)
c <- 3
<-done
}
Since done is also an unbuffered channel, receiving from it at the end of the main function must wait the sending of a value on the done channel, which happens after the value sent on channel c has been received and printed to the console.
Explanation for the seemingly non-deterministic runs:
Goroutines may or may not be executed parallel at the same time. Synchronization ensures that certain events happen before other events. That is the only guarantee you get, and the only thing you should rely on.
2 examples of this Happens Before:
The go statement that starts a new goroutine happens before the goroutine's execution begins.
A send on a channel happens before the corresponding receive from that channel completes.
For more details read The Go Memory Model.
Back to your example:
A receive from an unbuffered channel happens before the send on that channel completes.
So the only guarantee you get is that the goroutine that runs my_func() will receive the value from channel c sent from main(). But once the value is received, the main function may continue but since there is no more statements after the send, it simply ends - along with the program. Whether the non-main goroutine will have time or chance to print it with fmt.Println() is not defined.

Resources