Why is a goroutine blocked on a channel considered a deadlock? - go

As per the definition here, deadlock is related to resource contention.
In an operating system, a deadlock occurs when a process or thread enters a waiting state because a requested system resource is held by another waiting process, which in turn is waiting for another resource held by another waiting process. If a process is unable to change its state indefinitely because the resources requested by it are being used by another waiting process, then the system is said to be in a deadlock.
In the below code:
package main
import "fmt"
func main() {
c := make(chan string)
c <- "John"
fmt.Println("main() stopped")
}
The main() goroutine blocks until some other goroutine (there is none) receives that value from the channel.
But the output shows:
$ bin/cs61a
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan send]:
main.main()
/home/user/../myhub/cs61a/Main.go:8 +0x54
$
edit:
Regarding the point "the main goroutine–which is blocked, hence all goroutines are blocked, hence it's a deadlock": in the code below, the non-main goroutine is also blocked on a channel. Aren't all goroutines supposed to be blocked?
package main
import "fmt"
func makeRandom(randoms chan int) {
var ch chan int
fmt.Printf("print 1\n")
<-ch
fmt.Printf("print 2\n")
}
func main() {
randoms := make(chan int)
go makeRandom(randoms)
}
Edit 2:
Regarding the point in your answer, "not all your goroutines are blocked so it's not a deadlock": in the code below, only the main() goroutine is blocked, not worker():
package main
import (
"fmt"
)
func worker() {
fmt.Printf("some work\n")
}
func main() {
ch := make(chan int)
go worker()
<-ch
}
and the output says deadlock:
$ bin/cs61a
some work
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan receive]:
main.main()
/home/user/code/src/github.com/myhub/cs61a/Main.go:18 +0x6f
$
Ideally main() should not exit, because the channel is still in use by one goroutine.
Why is a goroutine blocked on a channel considered a deadlock?

In Go a deadlock is when all existing goroutines are blocked.
Your example has a single goroutine–the main goroutine–which is blocked, hence all goroutines are blocked, hence it's a deadlock.
Note: since all goroutines are blocked, new goroutines will not (cannot) be launched (because they can only be launched from running goroutines). And if all goroutines are blocked and cannot do anything, there is no point in waiting forever for nothing. So the runtime exits.
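As a minimal sketch of the difference, here is the first example with a second goroutine that receives from the channel, so the send in the calling goroutine can complete. The helper name sendWithReceiver and the extra result channel are illustrative assumptions, not part of the original code.

```go
package main

import "fmt"

// sendWithReceiver (hypothetical helper) sends name on an unbuffered channel
// while a second goroutine receives it, so the send does not block forever.
func sendWithReceiver(name string) string {
	c := make(chan string)
	got := make(chan string)

	// The receiver goroutine: without it, the send below would have no
	// counterpart and every goroutine would be blocked.
	go func() {
		got <- "received " + <-c
	}()

	c <- name    // completes once the receiver takes the value
	return <-got // wait for the receiver to report back
}

func main() {
	fmt.Println(sendWithReceiver("John"))
	fmt.Println("main() stopped")
}
```

Here at least one goroutine can always make progress, so the runtime never reports a deadlock.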
Edit:
Your edited code where you use a sleep in main is a duplicate of this: Go channel deadlock is not happening. Basically a sleep is not a blocking forever operation (the sleep duration is finite), so a goroutine sleeping is not considered in deadlock detection.
Edit #2:
Since then you removed the sleep() but it doesn't change anything. You have 2 goroutines: the main and the one executing makeRandom(). makeRandom() is blocked and main() isn't. So not all your goroutines are blocked so it's not a deadlock.
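The reason makeRandom() blocks is that ch is declared with var but never created with make, so it is a nil channel, and a receive from a nil channel never proceeds. A small sketch that checks this without blocking, using select with a default case (the helper receiveReady is an assumption made for illustration):

```go
package main

import "fmt"

// receiveReady (hypothetical helper) reports whether a receive on ch could
// proceed right now. Note that it consumes a value if one is available.
func receiveReady(ch chan int) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	var nilCh chan int // declared but never made: a nil channel

	buffered := make(chan int, 1)
	buffered <- 42

	fmt.Println(receiveReady(nilCh))    // false: a nil channel is never ready
	fmt.Println(receiveReady(buffered)) // true: a value is waiting
}
```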
Edit #3:
In your last example, when the runtime detects the deadlock there is only a single goroutine left: main. It's true that you launch a goroutine executing worker(), but that only prints some text and terminates. "Past" goroutines do not count: terminated goroutines can't do anything to change the blocked state of existing goroutines. Only existing goroutines count.
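One way to make the last example terminate normally is to have worker() send on the channel before it returns, so the receive in the main goroutine has a counterpart. This is only a sketch; passing the channel to worker() and the wrapper runWorker are assumptions added for illustration.

```go
package main

import "fmt"

// worker does its work and then reports on ch, which releases the receiver.
func worker(ch chan<- int) {
	fmt.Printf("some work\n")
	ch <- 1
}

// runWorker (hypothetical wrapper) starts the worker and waits for its report.
func runWorker() int {
	ch := make(chan int)
	go worker(ch)
	return <-ch // blocks only until the worker sends
}

func main() {
	fmt.Println("worker reported:", runWorker())
}
```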

Check out this article to understand exactly why a goroutine blocked on a channel is considered a deadlock:
http://dmitryvorobev.blogspot.com/2016/08/golang-channels-implementation.html
In your example above, the main goroutine gets added to the channel's waiting queue (sendq) and cannot be released until Go runs some goroutine that receives a value from the channel.

Related

Why doesn't panic show all running goroutines?

Page 253 of The Go Programming Language states:
... if instead of returning from main in the event of cancellation, we execute a call to panic, then the runtime will dump the stack of every goroutine in the program.
This code deliberately leaks a goroutine by waiting on a channel that never has anything to receive:
package main
import (
"fmt"
"time"
)
func main() {
never := make(chan struct{})
go func() {
defer fmt.Println("End of child")
<-never
}()
time.Sleep(10 * time.Second)
panic("End of main")
}
However, the runtime only lists the main goroutine when panic is called:
panic: End of main
goroutine 1 [running]:
main.main()
/home/simon/panic/main.go:15 +0x7f
exit status 2
If I press Ctrl-\ to send SIGQUIT during the ten seconds before main panics, I do see the child goroutine listed in the output:
goroutine 1 [sleep]:
time.Sleep(0x2540be400)
/usr/lib/go-1.17/src/runtime/time.go:193 +0x12e
main.main()
/home/simon/panic/main.go:14 +0x6c
goroutine 18 [chan receive]:
main.main.func1()
/home/simon/panic/main.go:12 +0x76
created by main.main
/home/simon/panic/main.go:10 +0x5d
I thought maybe the channel was getting closed as panic runs (which still wouldn't guarantee the deferred fmt.Println had time to execute), but I get the same behaviour if the child goroutine does a time.Sleep instead of waiting on a channel.
I know there are ways to dump goroutine stacktraces myself, but my question is why doesn't panic behave as described in the book? The language spec only says that a panic will terminate the program, so is the book simply describing implementation-dependent behaviour?
Thanks to kostix for pointing me to the GOTRACEBACK runtime environment variable. Setting this to all instead of leaving it at the default of single restores the behaviour described in TGPL. Note that this variable is significant to the runtime, but you can't manipulate it with go env.
The default of listing only the panicking goroutine is a change in Go 1.6. My edition of the book is copyrighted 2016 and gives Go 1.5 as the prerequisite for its example code, so it must predate the change. It's interesting, reading the change discussion, that there was concern about hiding useful information (as the recipient of many an incomplete error report, I can sympathise with this), but nobody called out the issue of scaling to large production systems that kostix mentioned.
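Besides running the program with GOTRACEBACK=all, the stacks of all goroutines can also be captured from inside the program with runtime.Stack. The following is a rough sketch of that approach, not what panic does internally; the helper names and the 64 KiB buffer size are assumptions.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// dumpAllStacks returns the stack traces of all goroutines as text.
// The output is truncated if it does not fit into the buffer.
func dumpAllStacks() string {
	buf := make([]byte, 1<<16)
	n := runtime.Stack(buf, true) // true: include every goroutine
	return string(buf[:n])
}

// countGoroutines counts the "goroutine N [...]:" header lines in a dump.
func countGoroutines(dump string) int {
	count := 0
	for _, line := range strings.Split(dump, "\n") {
		if strings.HasPrefix(line, "goroutine ") {
			count++
		}
	}
	return count
}

// dumpWithBlockedChild (hypothetical) leaks a goroutine that waits on a
// channel nobody sends to, then dumps all stacks.
func dumpWithBlockedChild() string {
	never := make(chan struct{})
	started := make(chan struct{})
	go func() {
		close(started) // signal that the child is running
		<-never        // block forever
	}()
	<-started
	return dumpAllStacks()
}

func main() {
	dump := dumpWithBlockedChild()
	fmt.Println("goroutines in dump:", countGoroutines(dump))
	fmt.Print(dump)
}
```

The dump should list the child goroutine alongside main, much like the SIGQUIT output shown above.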

How an os.Signal channel is handled internally in Go?

When having the code:
package main
import (
"os"
"os/signal"
)
func main() {
sig := make(chan os.Signal, 1)
signal.Notify(sig)
<-sig
}
Runs without problem, of course, blocking until you send a signal that interrupts the program.
But:
package main
func main() {
sig := make(chan int, 1)
<-sig
}
throws this error:
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan receive]:
main.main()
/home/user/project/src/main.go:5 +0x4d
exit status 2
While I understand why reading from the int channel causes a deadlock, I have only a suspicion
that the os.Signal doesn't because its channel can suffer writes from "the outside" as, well,
it handles signals and they come from outside the program.
Is my suspicion somewhat correct? If so, how the runtime handles this differently from other channel types?
Thank you!
You get a deadlock because you try to receive a message from the channel while no other goroutine is running, that is, no sender exists. By contrast, the call to signal.Notify starts a watchSignalLoop() goroutine in the background; you can verify the implementation details here: https://golang.org/src/os/signal/signal.go.
Channels don't care about the element type unless the element type is larger than 64kB (strictly speaking, there are other nuances; please check the implementation).
Don't guess how the runtime works; research it. For example, you can check what happens when you call make(chan int): run go tool compile -S main.go | grep main.go:line of make chan and check which function from the runtime package is called. Then jump to that file and invest some time in understanding the implementation. You will see that the implementation of channels is thin and straightforward compared to other things.
Hope it helps!
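To see that an os.Signal channel is fed by ordinary Go code rather than by special channel handling, here is a sketch in which the process sends a signal to itself and receives it on the channel. It assumes a Unix-like system (syscall.Kill and SIGUSR1 are not available on Windows); the helper name and the two-second timeout are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// receiveOwnSignal (hypothetical helper, Unix only) registers for SIGUSR1,
// sends that signal to the current process and waits for its delivery.
func receiveOwnSignal() (os.Signal, bool) {
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGUSR1)
	defer signal.Stop(sig)

	if err := syscall.Kill(syscall.Getpid(), syscall.SIGUSR1); err != nil {
		return nil, false
	}

	select {
	case s := <-sig: // delivered by the os/signal background goroutine
		return s, true
	case <-time.After(2 * time.Second): // safety net, assumed value
		return nil, false
	}
}

func main() {
	s, ok := receiveOwnSignal()
	fmt.Println(s, ok)
}
```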

If the Wait() method of the sync.WaitGroup type blocks, and is thus not asynchronous, why use it?

I have been looking into Golang, and seeing how good its concurrency is with its enforcement of a coroutine-channel-only model through its innovative goroutines construct.
One thing that I immediately find troubling is the use of the Wait() method, used to wait until multiple outstanding goroutines spawned inside a parent goroutine have finished. To quote the Golang docs
Wait can be used to block until all goroutines have finished
The fact that many go developers prescribe Wait() as the preferred way to implement concurrency seems antithetical to Golang's mission of enabling developers to write efficient software, because blocking is inefficient, and truly asynchronous code never blocks.
A process [or thread] that is blocked is one that is waiting for some event, such as a resource becoming available or the completion of an I/O operation.
In other words, a blocked thread will spend CPU cycles doing nothing useful, just checking repeatedly to see if its currently running task can stop waiting and continue its execution.
In truly asynchronous code, when a coroutine encounters a situation where it cannot continue until a result arrives, it must yield its execution to the scheduler instead of blocking, by switching its state from running to waiting, so the scheduler can begin executing the next-in-line coroutine from the runnable queue. The waiting coroutine should have its state changed from waiting to runnable only once the result it needs has arrived.
Therefore, since Wait() blocks until x number of goroutines have invoked Done(), the goroutine which calls Wait() will always remain in either a runnable or running state, wasting CPU cycles and relying on the scheduler to preempt the long-running goroutine only to change its state from running to runnable, instead of changing it to waiting as it should be.
If all this is true, and I'm understanding how Wait() works correctly, then why aren't people using the built-in Go channels for the task of waiting for sub-goroutines to complete? If I understand correctly, sending to a buffered channel, and reading from any channel are both asynchronous operations, meaning that invoking them will put the goroutine into a waiting state, so why aren't they the preferred method?
The article I referenced gives a few examples. Here's what the author calls the "Old School" way:
package main
import (
"fmt"
"time"
)
func main() {
messages := make(chan int)
go func() {
time.Sleep(time.Second * 3)
messages <- 1
}()
go func() {
time.Sleep(time.Second * 2)
messages <- 2
}()
go func() {
time.Sleep(time.Second * 1)
messages <- 3
}()
for i := 0; i < 3; i++ {
fmt.Println(<-messages)
}
}
and here is the preferred, "Canonical" way:
package main
import (
"fmt"
"sync"
"time"
)
func main() {
messages := make(chan int, 3) // buffered, so the senders can finish before main receives
var wg sync.WaitGroup
wg.Add(3)
go func() {
defer wg.Done()
time.Sleep(time.Second * 3)
messages <- 1
}()
go func() {
defer wg.Done()
time.Sleep(time.Second * 2)
messages <- 2
}()
go func() {
defer wg.Done()
time.Sleep(time.Second * 1)
messages <- 3
}()
wg.Wait()
close(messages) // lets the range loop below terminate
for i := range messages {
fmt.Println(i)
}
}
I can understand that the second might be easier to understand than the first, but the first is asynchronous where no coroutines block, and the second has one coroutine which blocks: the one running the main function. Here is another example of Wait() being the generally accepted approach.
Why isn't Wait() considered an anti-pattern by the Go community if it creates an inefficient blocked thread? Why aren't channels preferred by most in this situation, since they can be used to keep all the code asynchronous and the thread optimized?
Your understanding of "blocking" is incorrect. Blocking operations such as WaitGroup.Wait() or a channel receive (when there is no value to receive) only block the execution of the goroutine, they do not (necessarily) block the OS thread which is used to execute the (statements of the) goroutine.
Whenever a blocking operation (such as the above mentioned) is encountered, the goroutine scheduler may (and it will) switch to another goroutine that may continue to run. There are no (significant) CPU cycles lost during a WaitGroup.Wait() call, if there are other goroutines that may continue to run, they will.
Please check related question: Number of threads used by Go runtime
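A minimal sketch of the pattern: the goroutine calling Wait() is parked while the workers run, and it resumes once every worker has called Done(). The function name collect and the buffered results channel are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// collect (hypothetical helper) starts n workers, waits for all of them
// with a WaitGroup and returns the sum of the values they produced.
func collect(n int) int {
	results := make(chan int, n) // buffered: workers never block on send
	var wg sync.WaitGroup
	wg.Add(n)

	for i := 1; i <= n; i++ {
		go func(v int) {
			defer wg.Done()
			results <- v
		}(i)
	}

	wg.Wait()      // blocks only this goroutine; the workers keep running
	close(results) // all sends are done, so the range below terminates

	sum := 0
	for v := range results {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(collect(3)) // 6
}
```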

Why does this cause a deadlock in Go?

This is not a question about how to better write this. It's a question specifically about why Go is causing a deadlock in this scenario.
package main
import "fmt"
func main() {
chan1 := make(chan bool)
chan2 := make(chan bool)
go func() {
for {
<-chan1
fmt.Printf("chan1\n")
chan2 <- true
}
}()
go func() {
for {
<-chan2
fmt.Printf("chan2\n")
chan1 <- true
}
}()
for {
chan1 <- true
}
}
Outputs:
chan1
chan2
chan1
chan2
chan1
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan send]:
goroutine 5 [chan send]:
goroutine 6 [chan send]:
exit status 2
Why does this not cause an infinite loop? How come it does two full "ping-pings" (instead of just one) before giving up?
From the runtime perspective you get a deadlock because all routines try to send onto a channel and there's no routine waiting to receive anything.
But why is it happening? I will give you a story as I like visualising what my routines are doing when I encounter a deadlock.
You have two players (routines) and one ball (true value). Every player waits for a ball and once they get it they pass it back to the other player (through a channel). This is what your two routines are really doing and this would indeed produce an infinite loop.
The problem is the third player introduced in your main loop. He's hiding behind the second player and once he sees the first player has empty hands, he throws another ball at him. So we end up with both players holding a ball, unable to pass it to the other player because the other one already has a ball in his hands. The hidden, evil player is also trying to pass yet another ball. Everyone is confused, because there are three balls, three players and no empty hands.
In other words, you have introduced a third player who is breaking the game. He should be an arbiter who passes the very first ball at the beginning of the game and then watches, but stops producing balls! That means that instead of a loop in your main routine, there should simply be chan1 <- true (and some condition to wait on, so the program doesn't exit).
If you enable logging in the loop of main routine, you will see the deadlock occurs always on the third iteration. The number of times the other routines are executed depends on the scheduler. Bringing back the story: first iteration is a kick-off of the first ball; next iteration is a mysterious second ball, but this can be handled. The third iteration is a deadlock – it brings to life the third ball which can't be handled by anybody.
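The "single ball" version can be sketched as follows. To keep the example finite, the ball carries a hit counter and the rally stops after a fixed number of hits; the function rally, the counter, and the done channel are assumptions added for illustration and are not part of the original program.

```go
package main

import "fmt"

// rally (hypothetical helper) plays ping-pong with a single ball and
// returns the order of hits. It assumes hits >= 1.
func rally(hits int) []string {
	chan1 := make(chan int)
	chan2 := make(chan int)
	done := make(chan struct{})
	var trace []string

	// Each player waits for the ball, records the hit and passes it on.
	// Access to trace is ordered by the channel hand-offs.
	player := func(name string, in <-chan int, out chan<- int) {
		for {
			n := <-in
			trace = append(trace, name)
			if n == hits {
				close(done) // last hit: end the game
				return
			}
			out <- n + 1
		}
	}

	go player("chan1", chan1, chan2)
	go player("chan2", chan2, chan1)

	chan1 <- 1 // the arbiter kicks off exactly one ball
	<-done     // and then only waits
	return trace
}

func main() {
	fmt.Println(rally(5)) // [chan1 chan2 chan1 chan2 chan1]
}
```

Because main sends only once, there is always exactly one ball in play and one player with empty hands to receive it.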
It looks complicated but the answer is easy.
It'll deadlock when:
First routine is trying to write to chan2.
Second routine is trying to write to chan1.
Main is trying to write to chan1.
How can that happen? Example:
Main writes chan1. Blocks on another write.
Routine 1: chan1 receives from Main. Prints. Blocks on write chan2.
Routine 2: chan2 receives. Prints. Blocks on write chan1.
Routine 1: chan1 receives from Routine 2. Prints. Blocks on write chan2.
Routine 2: chan2 receives. Prints. Blocks on write chan1.
Main writes chan1. Blocks on another write.
Routine 1: chan1 receives from Main. Prints. Blocks on write chan2.
Main writes chan1. Blocks on another write.
Currently all routines are blocked. i.e.:
Routine 1 cannot write to chan2 because Routine 2 is not receiving but is actually blocked trying to write to chan1. But no one is listening on chan1.
As @HectorJ said, it all depends on the scheduler. But in this setup, a deadlock is inevitable.
goroutine 1 [chan send]:
goroutine 5 [chan send]:
goroutine 6 [chan send]:
This tells it all: all your goroutines are blocked trying to send on a channel with no one to receive on the other end.
So your first goroutine blocks on chan2 <- true, your second blocks on chan1 <- true and your main goroutine blocks on its own chan1 <- true.
As to why it does two "full ping-pings" like you say, it depends on scheduling and from which sender <-chan1 decides to receive first.
On my computer, I get more and it varies each time I run it:
chan1
chan2
chan1
chan2
chan1
chan2
chan1
chan2
chan1
chan2
chan1
chan2
chan1
fatal error: all goroutines are asleep - deadlock!

Why does this code generate an error?

Why does the piece of code below generate an error?
package main
import "fmt"
func main() {
messages := make(chan string)
messages <- "test" //line 16
fmt.Println(<-messages)
}
Generates the below error.
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan send]:
main.main()
/tmp/sandbox994400718/main.go:16 +0x80
A value is sent to the channel, and in the next line it's being received. Technically it should work.
Channels can be buffered or unbuffered. A buffered channel can store a number of items “inside” it, but when you add something to an unbuffered channel the goroutine adding the item can only continue when another goroutine removes the item. There is no place to “leave” the item; it must be passed directly to the other goroutine, and the first goroutine will wait until another one takes the item from it.
This is what is happening in your code. When you create a channel with make, if you don’t specify a capacity as the second argument you get an unbuffered channel. To create a buffered channel pass a second argument to make, e.g.
messages := make(chan string, 1) // could be larger than 1 if you want
This allows the goroutine to add the item (a string in this case) to the channel, where it will be available when another goroutine tries to get an item from the channel in the future, and the original goroutine can then continue processing.
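A minimal sketch of that fix, with the send and the receive in the same goroutine. The wrapper function sendThenReceive is an assumption made so the behaviour is easy to check.

```go
package main

import "fmt"

// sendThenReceive (hypothetical helper) sends and receives in the same
// goroutine, which works because the channel has room for one item.
func sendThenReceive(msg string) string {
	messages := make(chan string, 1) // capacity 1: one item can be "left" inside
	messages <- msg                  // does not block while the buffer has space
	return <-messages
}

func main() {
	fmt.Println(sendThenReceive("test")) // test
}
```

A second send before the receive would block again, because the buffer would already be full.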
I have learnt a lot about channels now, and now I'm able to answer the question.
On line 16, when the message "test" is sent to the channel by the main goroutine, execution pauses and the runtime looks for other goroutines that are ready to receive the value from the channel messages. Since there are no other goroutines, the runtime aborts with the deadlock fatal error. This is a classic example of a deadlock.
To fix this there are two things that can be done.
1) Use a buffered channel, as Matt suggested (one of the answers).
2) Otherwise, put the statement that sends to the channel or receives from the channel in a separate goroutine.
package main
import "fmt"
func main() {
messages := make(chan string)
go func() {
messages <- "test" //line 16
}()
fmt.Println(<-messages)
}
So the essential takeaways are:
1) An unbuffered channel can only be used to communicate between goroutines, i.e. when you send to the channel in one goroutine, you can receive the value only in another goroutine, not in the same one.
2) When data is sent to an unbuffered channel in a goroutine, the execution of that goroutine is paused until the data is received from the same channel in another goroutine.

Resources