Is there any API to let the main goroutine sleep forever?
In other words, I want my program to keep running until I stop it.
"Sleeping"
You can use numerous constructs that block forever without "eating" up your CPU.
For example a select without any case (and no default):
select{}
Or receiving from a channel where nobody sends anything:
<-make(chan int)
Or receiving from a nil channel also blocks forever:
<-(chan int)(nil)
Or sending on a nil channel also blocks forever:
(chan int)(nil) <- 0
Or locking an already locked sync.Mutex:
mu := sync.Mutex{}
mu.Lock()
mu.Lock()
Quitting
If you do want to provide a way to quit, a simple channel can do it. Provide a quit channel, and receive from it. When you want to quit, close the quit channel as "a receive operation on a closed channel can always proceed immediately, yielding the element type's zero value after any previously sent values have been received".
var quit = make(chan struct{})
func main() {
// Startup code...
// Then blocking (waiting for quit signal):
<-quit
}
// And in another goroutine if you want to quit:
close(quit)
Note that issuing a close(quit) may terminate your app at any time. Quoting from Spec: Program execution:
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
When close(quit) is executed, the last statement of our main() function can proceed which means the main goroutine can return, so the program exits.
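The quit-channel pattern above can be sketched as one small runnable program. This is only a sketch: runUntilQuit is a hypothetical helper name (not from the answer), and here the worker goroutine itself decides when to quit.

```go
package main

import "fmt"

// runUntilQuit is a hypothetical helper: it starts work in another
// goroutine and blocks until that goroutine closes the quit channel.
func runUntilQuit(work func()) string {
	quit := make(chan struct{})

	go func() {
		work()
		// Closing quit lets every pending receive proceed immediately.
		close(quit)
	}()

	// Blocks without using CPU until quit is closed.
	<-quit
	return "stopped"
}

func main() {
	status := runUntilQuit(func() { fmt.Println("working...") })
	fmt.Println(status)
}
```

Because the close happens after work() returns, anything the worker did is visible to the caller once the receive from quit completes.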
Sleeping without blocking
The above constructs block the goroutine, so if you don't have other goroutines running, that will cause a deadlock.
If you don't want to block the main goroutine but you just don't want it to end, you may use a time.Sleep() with a sufficiently large duration. The max duration value is
const maxDuration time.Duration = 1<<63 - 1
which is approximately 292 years.
time.Sleep(time.Duration(1<<63 - 1))
If you fear your app will run longer than 292 years, put the above sleep in an endless loop:
for {
time.Sleep(time.Duration(1<<63 - 1))
}
It depends on use cases to choose what kind of sleep you want.
@icza provides a good and simple solution for literally sleeping forever, but I want to add something extra in case you want your system to shut down gracefully.
You could do something like this:
func mainloop() {
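// Buffered: signal.Notify does not block when sending, so an unbuffered channel could miss the signal.
exitSignal := make(chan os.Signal, 1)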
signal.Notify(exitSignal, syscall.SIGINT, syscall.SIGTERM)
<-exitSignal
systemTeardown()
}
And in your main:
func main() {
systemStart()
mainloop()
}
In this way, main not only sleeps until it is told to stop, but can also perform a graceful shutdown after your code receives an INT or TERM signal from the OS, for example from Ctrl+C or kill.
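Here is a self-contained sketch of the same idea. The names awaitShutdown and simulatedRun are hypothetical. To keep the demo finite, it puts os.Interrupt on the channel by hand; a real program would omit that line and simply wait for Ctrl+C or kill.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// awaitShutdown blocks until a signal arrives on sigs, runs teardown,
// and returns the signal that triggered the shutdown.
func awaitShutdown(sigs <-chan os.Signal, teardown func()) os.Signal {
	sig := <-sigs
	teardown()
	return sig
}

// simulatedRun wires the pieces together. The hand-delivered signal is an
// assumption made only so that this example terminates on its own.
func simulatedRun() (tornDown, gotInterrupt bool) {
	// Buffered, because signal.Notify never blocks when delivering.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	defer signal.Stop(sigs)

	sigs <- os.Interrupt // demo only: stands in for a real Ctrl+C

	sig := awaitShutdown(sigs, func() { tornDown = true })
	return tornDown, sig == os.Interrupt
}

func main() {
	tornDown, gotInterrupt := simulatedRun()
	fmt.Println("teardown ran:", tornDown, "interrupt received:", gotInterrupt)
}
```

Passing the channel and the teardown function as parameters keeps the blocking logic separate from signal registration, which also makes it easy to exercise without sending real signals.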
Another solution to block a goroutine. This solution prevents the Go runtime from complaining about a deadlock:
import "time"
func main() {
for {
time.Sleep(1138800 * time.Hour)
}
}
Related
I would like to cancel a running command on demand. For this I am using exec.CommandContext, and I am currently trying this:
https://play.golang.org/p/0JTD9HKvyad
package main
import (
"context"
"log"
"os/exec"
"time"
)
func Run(quit chan struct{}) {
ctx, cancel := context.WithCancel(context.Background())
cmd := exec.CommandContext(ctx, "sleep", "300")
err := cmd.Start()
if err != nil {
log.Fatal(err)
}
go func() {
log.Println("waiting cmd to exit")
err := cmd.Wait()
if err != nil {
log.Println(err)
}
}()
go func() {
select {
case <-quit:
log.Println("calling ctx cancel")
cancel()
}
}()
}
func main() {
ch := make(chan struct{})
Run(ch)
select {
case <-time.After(3 * time.Second):
log.Println("closing via ctx")
ch <- struct{}{}
}
}
The problem I am facing is that cancel() is called but the process is not killed. My guess is that the main goroutine exits first and does not wait for cancel() to properly terminate the command, mainly because if I add a time.Sleep(time.Second) at the end of the main function, the running command is killed on exit.
Any idea how I could wait, without using a sleep, to ensure that the command has been killed before exiting? Could cancel() be signaled through a channel once the command has been successfully killed?
In an attempt to use a single goroutine I tried this: https://play.golang.org/p/r7IuEtSM-gL but cmd.Wait() seems to block the select the whole time, so cancel() could not be called.
In Go, the program will stop if the end of the main method (in the main package) is reached. This behavior is described in the Go language specification under a section on program execution (emphasis my own):
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
Defects
I will consider each of your examples and their associated control flow defects. You will find links to the Go playground below, but the code in these examples will not execute in the restrictive playground sandbox as the sleep executable cannot be found. Copy and paste to your own environment for testing.
Multiple goroutine example
case <-time.After(3 * time.Second):
log.Println("closing via ctx")
ch <- struct{}{}
After the timer fires and you signal to the goroutine it is time to kill the child and stop work, there is nothing to cause the main method to block and wait for this to complete, so it returns. In accordance with the language spec, the program exits.
Because ch is unbuffered, the send completes only once the other goroutine has received from it, but from that point there is a race between main exiting and that goroutine going on to call cancel(). It is unsafe to assume any particular interleaving of behavior, and for practical purposes it is unlikely that any useful work will happen before main quits. The sleep child process will be orphaned; on Unix systems, the operating system will normally re-parent the process onto the init process.
Single goroutine example
Here, you have the opposite problem: main does not return, so the child process is not killed. This situation is only resolved when the child process exits (after 5 minutes). This occurs because:
The call to cmd.Wait in the Run method is a blocking call (docs). The select statement is blocked waiting for cmd.Wait to return an error value, so cannot receive from the quit channel.
The quit channel (declared as ch in main) is an unbuffered channel. Send operations on unbuffered channels will block until a receiver is ready to receive the data. From the language spec on channels (again, emphasis my own):
The capacity, in number of elements, sets the size of the buffer in the channel. If the capacity is zero or absent, the channel is unbuffered and communication succeeds only when both a sender and receiver are ready.
As Run is blocked in cmd.Wait, there is no ready receiver to receive the value transmitted on the channel by the ch <- struct{}{} statement in the main method. main blocks waiting to transmit this data, which prevents the process returning.
We can demonstrate both issues with minor code tweaks.
cmd.Wait is blocking
To expose the blocking nature of cmd.Wait, declare the following function and use it in place of the Wait call. This function is a wrapper with the same behavior as cmd.Wait, but with additional side effects that print what is happening to STDOUT. (Playground link):
func waitOn(cmd *exec.Cmd) error {
fmt.Printf("Waiting on command %p\n", cmd)
err := cmd.Wait()
fmt.Printf("Returning from waitOn %p\n", cmd)
return err
}
// Change the select statement call to cmd.Wait to use the wrapper
case e <- waitOn(cmd):
Upon running this modified program, you will observe the output Waiting on command <pointer> to the console. After the timers fire, you will observe the output calling ctx cancel, but no corresponding Returning from waitOn <pointer> text. This will only occur when the child process returns, which you can observe quickly by reducing the sleep duration to a smaller number of seconds (I chose 5 seconds).
Send on the quit channel, ch, blocks
main cannot return because the signal channel used to propagate the quit request is unbuffered and there is no corresponding listener. By changing the line:
ch := make(chan struct{})
to
ch := make(chan struct{}, 1)
the send on the channel in main will proceed (to the channel's buffer) and main will quit – the same behavior as the multiple goroutine example. However, this implementation is still broken: the value will not be read from the channel's buffer to actually start stopping the child process before main returns, so the child process will still be orphaned.
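The difference between the unbuffered and the buffered channel can be observed directly with a non-blocking send. This is a minimal sketch; trySend is a hypothetical helper that uses select with a default case, so it reports whether a send could proceed instead of blocking.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(ch chan struct{}) bool {
	select {
	case ch <- struct{}{}:
		return true
	default:
		// No receiver is ready and there is no free buffer slot.
		return false
	}
}

func main() {
	unbuffered := make(chan struct{})
	buffered := make(chan struct{}, 1)

	fmt.Println(trySend(unbuffered)) // false: no receiver is waiting
	fmt.Println(trySend(buffered))   // true: the value goes into the buffer
	fmt.Println(trySend(buffered))   // false: the buffer is now full
}
```

A successful send into the buffer only means the value was stored. It says nothing about whether anyone will ever read it, which is why the buffered variant still leaves the child process orphaned.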
Fixed version
I have produced a fixed version for you, code below. There are also some stylistic improvements to convert your example to more idiomatic Go:
Indirection via a channel to signal when it is time to stop is unnecessary. Instead, we can avoid declaring a channel by hoisting declaration of the context and cancellation function to the main method. The context can be cancelled directly at the appropriate time.
I have retained the separate Run function to demonstrate passing the context in this way, but in many cases, its logic could be embedded into the main method, with a goroutine spawned to perform the cmd.Wait blocking call.
The select statement in the main method is unnecessary as it only has one case statement.
sync.WaitGroup is introduced to explicitly solve the problem of main exiting before the child process (waited on in a separate goroutine) has been killed. The wait group implements a counter; the call to Wait blocks until all goroutines have finished working and called Done.
package main
import (
"context"
"log"
"os/exec"
"sync"
"time"
)
func Run(ctx context.Context) {
cmd := exec.CommandContext(ctx, "sleep", "300")
err := cmd.Start()
if err != nil {
// Run could also return this error and push the program
// termination decision to the `main` method.
log.Fatal(err)
}
err = cmd.Wait()
if err != nil {
log.Println("waiting on cmd:", err)
}
}
func main() {
var wg sync.WaitGroup
ctx, cancel := context.WithCancel(context.Background())
// Increment the WaitGroup synchronously in the main method, to avoid
// racing with the goroutine starting.
wg.Add(1)
go func() {
Run(ctx)
// Signal the goroutine has completed
wg.Done()
}()
<-time.After(3 * time.Second)
log.Println("closing via ctx")
cancel()
// Wait for the child goroutine to finish, which will only occur when
// the child process has stopped and the call to cmd.Wait has returned.
// This prevents main() exiting prematurely.
wg.Wait()
}
(Playground link)
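As the comment in Run hints, the error could also be returned to the caller instead of calling log.Fatal. The following is one possible sketch of that variant, not part of the fixed version above. It replaces the WaitGroup with a buffered result channel, and runAndCancel is a hypothetical name. It assumes a sleep executable is available on the PATH.

```go
package main

import (
	"context"
	"log"
	"os/exec"
	"time"
)

// runAndCancel is a hypothetical helper: it starts the command, cancels its
// context after delay, and returns whatever cmd.Wait (or cmd.Start) reports.
func runAndCancel(delay time.Duration, name string, args ...string) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	cmd := exec.CommandContext(ctx, name, args...)
	if err := cmd.Start(); err != nil {
		return err // the caller decides whether this is fatal
	}

	// Buffered so the goroutine can always deliver its result and exit.
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	select {
	case err := <-done:
		return err // the command finished on its own
	case <-time.After(delay):
		cancel()      // asks exec to kill the child process
		return <-done // block until cmd.Wait has actually returned
	}
}

func main() {
	err := runAndCancel(3*time.Second, "sleep", "300")
	log.Println("command ended:", err)
}
```

The second receive from done plays the same role as wg.Wait() in the fixed version: the function cannot return before the child process has been waited on.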
Here is a simple code example using an unbuffered channel:
ch01 := make(chan string)
go func() {
fmt.Println("We are in the sub goroutine")
fmt.Println(<-ch01)
}()
fmt.Println("We are in the main goroutine")
ch01 <- "Hello"
The result I got:
We are in the main goroutine
We are in the sub goroutine
Hello
Go playground:
https://play.golang.org/p/rFWQbwXRzGw
From my understanding, the send operation blocked the main goroutine, until the sub goroutine executed a receive operation on channel ch01. Then the program exited.
After moving the sub goroutine below the send operation, like this:
fmt.Println("We are in the main goroutine")
ch01 <- "Hello"
go func() {
fmt.Println("We are in the sub goroutine")
fmt.Println(<-ch01)
}()
A deadlock occurred:
We are in the main goroutine
fatal error: all goroutines are asleep - deadlock!
go playground
https://play.golang.org/p/DmRUiBG4UmZ
What happened this time? Did that mean after ch01 <- "Hello" the main goroutine was immediately blocked so that the sub goroutine had no chance to run? If it is true, how should I understand the result of the first code example?(At first in main goroutine, then in sub goroutine).
An unbuffered channel blocks on send until a receiver is ready to read. In your first example a reader is set up first, so when the send occurs it can be sent immediately.
In your second example, the send happens before a receiver is ready so the send blocks and the program deadlocks.
You could fix the second example by making a buffered channel, but there is a chance you won't ever see the output from the goroutine as the program may exit (the main goroutine) before the output buffer is flushed. The goroutine may not even run as main exits before it can be scheduled.
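A minimal sketch of that fix, with one extra assumption: a second channel (called done here, a hypothetical name) is added so that main waits for the goroutine instead of exiting early.

```go
package main

import "fmt"

// sendThenStart sends before any receiver exists. The buffer of size one
// makes that send legal; done makes the caller wait for the goroutine.
func sendThenStart(msg string) string {
	ch := make(chan string, 1)
	done := make(chan string)

	ch <- msg // does not block: the value is stored in the buffer

	go func() {
		received := <-ch
		// Hand the value back so the caller knows the goroutine ran.
		done <- received
	}()

	return <-done
}

func main() {
	fmt.Println(sendThenStart("Hello")) // Hello
}
```

Without the receive from done, the function could return before the goroutine was ever scheduled.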
First of all, goroutines run concurrently. In the first example, the sub goroutine has already been started when the send happens, but in the second example it has not been started yet.
Think about it line by line.
In the first example, the sub goroutine is started before the send operation is reached on the main goroutine. As a result, when the send operation happens, a receiver (the sub goroutine) already exists.
If you tweak the 1st example,
package main
import (
"fmt"
"time"
)
func main() {
ch01 := make(chan string)
go func() {
fmt.Println("We are in the sub goroutine")
fmt.Println(<-ch01)
}()
// wait for start of sub-routine
time.Sleep(time.Second * 2)
fmt.Println("We are in the main goroutine")
ch01 <- "Hello"
// wait for the routine to receive and print the string
time.Sleep(time.Second * 2)
}
The output will be
We are in the sub goroutine
We are in the main goroutine
Hello
So you can see that the sub goroutine has already started and is waiting to receive on the channel. When the main goroutine sends the string on the channel, the sub goroutine resumes and receives it.
But in the second example, the program is stuck in the main goroutine's send operation, and the sub goroutine has not started yet and never will, because the program never reaches that line. So there is no receiver for the value, and the program is stuck in a deadlock.
For unbuffered channels, the sending goroutine is blocked until there is someone to receive the value. So there must first be a goroutine ready to receive from the channel before a value is sent to it. In your example, where the value is sent first, a buffered channel is required so that the value is stored in the buffer until someone receives it. Something like this will work:
package main
import (
"fmt"
"time"
)
func main() {
ch01 := make(chan string, 10)
ch01 <- "Hello"
go func() {
fmt.Println("We are in the sub goroutine")
fmt.Println(<-ch01)
}()
fmt.Println("We are in the main goroutine")
time.Sleep(1 * time.Second)
}
Playground
Did that mean after ch01 <- "Hello" the main goroutine was immediately
blocked so that the sub goroutine had no chance to run? If it is true,
how should I understand the result of the first code example?(At first
in main goroutine, then in sub goroutine).
That's true. You understand things right. The order in which spawned goroutines execute is unspecified and can only be controlled with synchronization tools (channels, mutexes). The sub goroutine in the first example may just as well print first in another environment. It's simply unspecified.
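To illustrate controlling the order with channels, here is a sketch in which an extra channel (ready, a hypothetical name) forces the sub goroutine's first step to happen before main's first step. The events are collected in a slice rather than printed directly, so the order can be inspected.

```go
package main

import "fmt"

// orderedRun records three events in a fixed order, enforced by channels.
func orderedRun() []string {
	var events []string
	ready := make(chan struct{})
	ch := make(chan string)
	done := make(chan struct{})

	go func() {
		events = append(events, "sub goroutine started")
		close(ready) // main may proceed only after the line above
		msg := <-ch  // blocks until main sends
		events = append(events, msg)
		close(done)
	}()

	<-ready
	events = append(events, "main goroutine sending")
	ch <- "Hello"
	<-done // wait until the sub goroutine has recorded the message
	return events
}

func main() {
	for _, e := range orderedRun() {
		fmt.Println(e)
	}
}
```

Each append is separated from the next by a channel operation, so the two goroutines never touch the slice at the same time.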
package main
import (
"fmt"
"time"
)
func main() {
p := producer()
for c := range p {
fmt.Println(c)
}
}
func producer() <-chan string {
ch := make(chan string)
go func() {
for i := 0; i < 5; i++ {
ch <- fmt.Sprint("hello", i)
time.Sleep(1 * time.Second)
}
// commented the below to show the issue
// close(ch)
}()
return ch
}
Running the above code will print 5 messages and then fail with "fatal error: all goroutines are asleep - deadlock!". I understand that if I close the channel the error is gone.
What I would like to understand is how the Go runtime knows that the code will wait forever on the channel and that nothing else will send data into it.
Now if I add an additional goroutine to the main() function, it does not throw any error and keeps waiting on the channel.
go func() {
for {
time.Sleep(2 * time.Millisecond)
}
}()
So does this mean the Go runtime is just looking for the presence of a running goroutine that could potentially send data into the channel, and hence not throwing the deadlock error?
If you want some more insight into how Go implements the deadlock detection, have a look at the place in the code that throws the "all goroutines are asleep - deadlock!": https://github.com/golang/go/blob/master/src/runtime/proc.go#L3751
It looks like the Go runtime keeps some fairly simple accounting on how many goroutines there are, how many are idle, and how many are sleeping for locks (not sure which one sleep on channel I/O will increment). At any given time (serialized with the rest of the runtime), it just does some arithmetic and checks if all - idle - locked > 0... if so, then the program could still make progress... if it's 0, then you're definitely deadlocked.
It's possible you could introduce a livelock by preventing a goroutine from sleeping via an infinite loop (like what you did in your experiment, and apparently sleep for timers isn't treated the same by the runtime). The runtime wouldn't be able to detect a deadlock in that case, and run forever.
Furthermore, I'm not sure when exactly the runtime checks for deadlocks- further inspection of who calls that checkdead() may yield some insight there, if you're interested.
DISCLAIMER- I'm not a Go core developer, I just play one on TV :-)
The runtime stops the program with the fatal error "all goroutines are asleep - deadlock!" when all goroutines are blocked on channel and mutex operations.
The sleeping goroutine does not block on one of these operations. As far as the runtime can tell there is no deadlock, and therefore no error.
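For completeness, here is a sketch of the producer from the question with the channel closed, which lets the range loop end normally instead of blocking forever. The parameter n and the use of defer are choices made for this example; the one-second sleep is left out to keep it short.

```go
package main

import "fmt"

// producer sends n messages and then closes the channel, so that a
// range loop over the channel terminates after the last message.
func producer(n int) <-chan string {
	ch := make(chan string)
	go func() {
		// Runs when the goroutine returns, after the final send.
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- fmt.Sprint("hello", i)
		}
	}()
	return ch
}

func main() {
	for msg := range producer(5) {
		fmt.Println(msg)
	}
	fmt.Println("producer finished")
}
```

Closing is the sender's job: only the producer knows that no more values will follow.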
The following code runs perfectly fine:
package main
import (
"fmt"
)
func my_func(c chan int){
fmt.Println(<-c)
}
func main(){
c := make(chan int)
go my_func(c)
c<-3
}
playground_1
However if I change
c<-3
to
time.Sleep(time.Second)
c<-3
playground_2
the program no longer prints anything.
My gut feeling is that somehow main returns before the my_func finishes executing, but it seems like adding a pause should not have any effect. I am totally lost on this simple example, what's going on here?
When the main function ends, the program ends with it. It does not wait for other goroutines to finish.
Quoting from the Go Language Specification: Program Execution:
Program execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
So once your main function succeeds in sending the value on the channel, the program might terminate immediately, before the other goroutine has the chance to print the received value to the console.
If you want to make sure the value gets printed to the console, you have to synchronize it with the event of exiting from the main function:
Example with a "done" channel (try it on Go Playground):
func my_func(c, done chan int) {
fmt.Println(<-c)
done <- 1
}
func main() {
c := make(chan int)
done := make(chan int)
go my_func(c, done)
time.Sleep(time.Second)
c <- 3
<-done
}
Since done is also an unbuffered channel, receiving from it at the end of the main function must wait for a value to be sent on the done channel, which happens after the value sent on channel c has been received and printed to the console.
Explanation for the seemingly non-deterministic runs:
Goroutines may or may not be executed in parallel. Synchronization ensures that certain events happen before other events. That is the only guarantee you get, and the only thing you should rely on.
Two examples of this happens-before relationship:
The go statement that starts a new goroutine happens before the goroutine's execution begins.
A send on a channel happens before the corresponding receive from that channel completes.
For more details read The Go Memory Model.
Back to your example:
A receive from an unbuffered channel happens before the send on that channel completes.
So the only guarantee you get is that the goroutine that runs my_func() will receive the value from channel c sent from main(). But once the value is received, the main function may continue, and since there are no more statements after the send, it simply ends, along with the program. Whether the non-main goroutine will have the time or the chance to print the value with fmt.Println() is not defined.
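A done channel is not the only way to get this guarantee. The following sketch uses a sync.WaitGroup instead. receiveAndWait is a hypothetical name, and returning the received value is an addition for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// receiveAndWait sends v to a goroutine and returns only after that
// goroutine has received and printed it.
func receiveAndWait(v int) int {
	c := make(chan int)
	var wg sync.WaitGroup
	var got int

	wg.Add(1) // register the goroutine before starting it
	go func() {
		defer wg.Done()
		got = <-c
		fmt.Println(got)
	}()

	c <- v
	wg.Wait() // returns only after the goroutine has called Done
	return got
}

func main() {
	receiveAndWait(3)
}
```

The call to wg.Done() happens before wg.Wait() returns, so reading got afterwards is safe.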
In the code below, the loop iterates two times.
Is it possible for "test2 <- true" to run exactly between the first iteration and the second iteration?
I mean, is there a chance that true is sent to "test2" after the first iteration has ended and before the second iteration has started?
package main
import "log"
import "time"
func main() {
test := make(chan bool, 1)
test2 := make(chan bool, 1)
go func() {
for {
select {
case <-test:
log.Println("test")
case <-test2:
log.Println("test2")
}
}
}()
test <- true
time.Sleep(1)
test2 <- true
time.Sleep(1)
}
Yes. Since your channels are buffered and can hold 1 value, the main execution flow can continue without your anonymous goroutine reading the value you sent to the test channel, and it can send a value on the test2 channel before the goroutine wakes up and reads the value on the test channel.
This is unlikely to happen, since you have a time.Sleep() call there that normally gives the goroutine time to execute (although time.Sleep(1) pauses for only one nanosecond), but there's no telling what will happen in a corner case such as your machine being very busy, being suspended at an (un)lucky time, or other things you didn't think about.
If your test channel was unbuffered, the test <- true statement would block until your goroutine received the value, and there would at least be no possibility for the goroutine to receive from test2 before receiving anything from the test channel.
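That both sends can complete before anything is received is easy to check. In this small sketch no receiver is running at all; pendingAfterSends is a hypothetical helper that reports how many values sit in each buffer afterwards.

```go
package main

import "fmt"

// pendingAfterSends sends on both buffered channels without any receiver
// and returns the number of values waiting in each buffer.
func pendingAfterSends() (int, int) {
	test := make(chan bool, 1)
	test2 := make(chan bool, 1)

	test <- true  // does not block: the buffer has room for one value
	test2 <- true // likewise, so both values are now pending

	return len(test), len(test2)
}

func main() {
	a, b := pendingAfterSends()
	fmt.Println(a, b) // 1 1
}
```

When a select finds several cases ready at once, it picks one of them at random, so in this state the goroutine in the question could log "test2" before "test".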
To add to nos' answer, you can simulate that case (where "test2 <- true" is run exactly between the first iteration and the second iteration) easily enough by making your first message reception (case <-test) wait one second.
case <-test:
log.Println("test")
time.Sleep(1 * time.Second)
By the time the anonymous goroutine wakes up, main() has already sent its two messages to the two buffered channels (buffered means non-blocking for one message) and exited.
If main() exits, everything else, including the goroutine which was busy sleeping, stops.
See play.golang.org: the output would be:
2009/11/10 23:00:00 test
You wouldn't have the time to see test2.
In order to make sure your goroutine can process both messages, you need:
main() to wait for said goroutine to finish. That is where the sync package comes into play, using (for instance, this isn't the only solution) a WaitGroup directive.
var wg sync.WaitGroup
wg.Add(1)
go func() {
// Decrement the counter when the goroutine completes.
defer wg.Done()
...
}()
... // end of main():
// Wait for goroutine to complete.
wg.Wait()
the goroutine to actually exit at some time (instead of being stuck in the for loop forever). See "In Go, does a break statement break from a switch/select?"
loop: // <== label on the for loop
for {
select {
case <-test:
log.Println("test")
time.Sleep(1 * time.Second)
case <-test2:
log.Println("test2")
break loop // <== leaves the for loop, not just the select
}
}
See play.golang.org: the message test2 is sent while the goroutine is sleeping after test, but main() will wait (wg.Wait()), and the goroutine will have its chance to read and print test2 one second later.
The output is:
2009/11/10 23:00:00 test
2009/11/10 23:00:01 test2 // one second later
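Putting the two pieces together gives a complete program along these lines. It is a sketch with three deliberate deviations from the snippets above. The test channel is unbuffered (as nos suggests), so that "test" is always handled first. The sleep is shortened to 10 milliseconds. The received names are also collected in a slice so that the order can be checked.

```go
package main

import (
	"log"
	"sync"
	"time"
)

// run starts the receiving goroutine, sends both messages, waits for the
// goroutine to finish, and returns the messages in the order handled.
func run() []string {
	test := make(chan bool)     // unbuffered: the send waits for the receive
	test2 := make(chan bool, 1) // buffered: the send returns immediately
	var received []string

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
	loop:
		for {
			select {
			case <-test:
				log.Println("test")
				received = append(received, "test")
				time.Sleep(10 * time.Millisecond)
			case <-test2:
				log.Println("test2")
				received = append(received, "test2")
				break loop // leave the for loop so the goroutine ends
			}
		}
	}()

	test <- true
	test2 <- true // sent while the goroutine is still sleeping
	wg.Wait()     // main waits here until the goroutine has finished
	return received
}

func main() {
	log.Println("handled:", run())
}
```

With both channels buffered, as in the question, the select could pick test2 first and the loop would exit without ever handling test.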