I’m writing the Walk function in the go tutorial that basically traverses a tree in-order. What I have works:
package main
import (
"fmt"
"code.google.com/p/go-tour/tree"
)
// Walk walks the tree t sending all values
// from the tree to the channel ch.
func Walk__helper(t *tree.Tree, ch chan int) {
if (t == nil) {
return
}
Walk__helper(t.Left, ch)
ch <- t.Value
Walk__helper(t.Right, ch)
}
func Walk(t *tree.Tree, ch chan int) {
Walk__helper(t, ch)
close(ch)
}
func main() {
ch := make(chan int)
go Walk(tree.New(1), ch)
for v := range ch {
fmt.Println(v)
}
}
Why must I use go Walk(tree.New(1), ch) instead of just Walk(tree.New(1), ch)?
I was under the impression that the go keyword basically spawns a new thread. In that case, we’d run into issues since the for loop might run before the subroutine completes.
Strangely, when I take out the go keyword, I get a deadlock. This is rather counterintuitive to me. What exactly is the go keyword doing here?
The key point here is range when coupled with a channel.
When you range over a channel (in this case, ch), it will wait for items to be sent on the channel before iterating through the loop. This is a safe, "blocking" action, that will not deadlock while it waits for the channel to receive an item.
The deadlock occurs when not using a goroutine because your channel isn't buffered. If you don't use a goroutine, then the method call is synchronous, Walk puts something on the channel.. and it blocks until that is popped off. It never gets popped off.. because the method call was synchronous.
I was under the impression that the go keyword basically spawns a new thread
..that is incorrect. There are many more important implementation details required to understand what goes on there. You should separate your thought process of a goroutine from a thread.. and just think of a goroutine as a concurrently executing piece of code, without a "thread".
Related
I'm new to Golang and I have hard time figuring out why exactly the following code produces deadlock. Also, how can I fix it so it works?
package main
import "fmt"
func main() {
m := make(map[int]chan string)
go func() {
m[0] = make(chan string)
m[0] <- "abab"
}()
fmt.Println(<-m[0])
}
EDIT:
Thanks for your answers! Unfortunately, initializing m[0] with
m[0] = make(chan string)
before launching a new goroutine is not exactly what I want. My problem is: is there a way to create channels "dynamically"? E.g. I have a map m of type map[int]chan string and I receive requests that contain something like id of type int. I would like to send a message via channel map[id], but initializing channels for every int would be too costly. How do I solve/work around this?
So, in other words, I would like to have a separate job queue for every id and initialize each queue lazily.
Updated answer after OP updated the question
You can just loop on all the keys in your map, maybe have another goroutine that keeps looping on all the keys. Obviously if a key hasnt been initialized, then it wont come up in the for range loop. For each key, you can then start a goroutine that listens so it doesnt block, or you can use a buffered channels so they wont block up to the buffer limit. You can also preferably use a waitGroup, rather than the time.Sleep(), these are only for this trivial example.
package main
import (
"fmt"
"time"
)
func main() {
m := make(map[int]chan string)
go func() {
m[0] = make(chan string)
m[0] <- "abab"
}()
time.Sleep(time.Second * 1) //sleep so the above goroutine initializes the key 0 channel
for key := range m{ //loop on all non-nil keys
fmt.Println(key)
go func(k int){ // goroutine to listen on this channel
fmt.Println(<- m[k])
}(key)
}
time.Sleep(time.Second * 1) //sleep so u can see the effects of the channel recievers
}
Old answer
This is how the flow is. The main goroutine starts. The map is created. The main goroutine encounters another goroutine. It spawns said goroutine and goes on with its life. Then it meets this line, fmt.Println(<-m[0]), which is a problem, since the map is indeed initialized, but the channel in the map itself isnt initialized! By the time the main goroutine has reached fmt.Println(<-m[0]), the other goroutine hadn't yet initialized the channel! So its a simple fix, just initialize the channel before spawning the goroutine and you're good to go!
package main
import "fmt"
func main() {
m := make(map[int]chan string)
m[0] = make(chan string)
go func() {
m[0] <- "abab"
}()
fmt.Println(<-m[0])
}
Edit: Note that fmt.Println(<-m[0]) is blocking, which means that if in that other goroutine, you dont send on the channel, you will also go into a deadlock, since you are trying to recieve on the channel when no one is actually sending.
You need to synchronize the creation of a channel.
As it stands, your main thread arrives at <-m[0] while m[0] is still an uninitialized channel, and receiving on an uninitialized channel blocks forever.
Your go routine creates a new channel and places it in m[0], but the main go routine is already listening on the prior zero value. Sending on this new channel also blocks forever, as there is nothing reading from it, so all go routines block.
To fix this, move m[0] = make(chan string) above your go routine, so it happens synchronously.
I need a way to signal from one main goroutine, an unknown number of other goroutines, multiple times. I also need for those other goroutines to select on multiple items, so busy waiting is (probably) not an option. I have come up with the following solution:
package main
import (
"context"
"fmt"
"sync"
"time"
)
type signal struct {
data []int
channels []chan struct{}
}
func newSignal() *signal {
s := &signal{
data: make([]int, 0),
channels: make([]chan struct{}, 1),
}
s.channels[0] = make(chan struct{})
return s
}
func (s *signal) Broadcast(d int) {
s.data = append(s.data, d)
s.channels = append(s.channels, make(chan struct{}))
close(s.channels[len(s.data)-1])
}
func test(s *signal, wg *sync.WaitGroup, id int, ctx context.Context) {
for i := 0; ; i += 1 {
select {
case <-s.channels[i]:
if id >= s.data[i] {
fmt.Println("Goroutine completed:", id)
wg.Done()
return
}
case <-ctx.Done():
fmt.Println("Goroutine completed:", id)
wg.Done()
return
}
}
}
func main() {
s := newSignal()
ctx, cancel := context.WithCancel(context.Background())
wg := sync.WaitGroup{}
wg.Add(3)
go test(s, &wg, 3, ctx)
go test(s, &wg, 2, ctx)
go test(s, &wg, 1, ctx)
s.Broadcast(3)
time.Sleep(1 * time.Second)
// multiple broadcasts is mandatory
s.Broadcast(2)
time.Sleep(1 * time.Second)
// last goroutine
cancel()
wg.Wait()
}
Playground: https://play.golang.org/p/dGmlkTuj7Ty
Is there a more elegant way to do this? One that uses builtin libraries only. If not, is this a safe/ok to use solution? I believe it is safe at least, as it works for a large number of goroutines (I have done some testing with it).
To be concise, here is exactly what I want:
The main goroutine (call it M) must be able to signal with some data (call it d) some unknown number of other goroutines (call them n for 0...n), multiple times, with each goroutine taking an action based on d each time
M must be able to signal all of the other n goroutines with certain (numerical) data, multiple times
Every goroutine in n will either terminate on its own (based on a context) or after doing some operation with d and deciding its fate. It will be performing this check as many times as signaled until it dies.
I am not allowed to keep track of the n goroutines in any way (eg. having a map of channels to goroutines and iterating)
In my solution, the slices of channels do not represent goroutines: they actually represent signals that are being broadcast out. This means that if I broadcast twice, and then a goroutine spins up, it will check both signals before sleeping in the select block.
It seems to me that you might want something like a fan-out pattern. Here's one source describing fan-in and fan-out, among other concurrency patterns. Here's a blog post on golang.org about this too. I think it's essentially a version of observer pattern using channels.
Basically, you want something, say Broadcaster, that keeps a list of channels. When you call Broadcaster.send(data), it loops over the list of channels sending data on each channel. Broadcaster must also have a way for goroutines to subscribe to Broadcaster. Goroutines must have a way to accept a channel from Broadcasteror give a channel to Broadcaster. That channel is the communication link.
If the work to be performed in the "observer" goroutines will take long, consider using buffered channels so that Broadcaster is not blocking during send and waiting on goroutines. If you don't care if a goroutine misses a data, you can use a non-blocking send (see below).
When a goroutine "dies", it can unsubscribe from Broadcaster which will remove the appropriate channel from its list. Or the channel can just remain full, and Broadcaster will have to use a non-blocking send to skip over full channels to dead goroutines.
I can't say what I described is comprehensive or 100% correct. It's just a quick description of the first general things I'd try based on your problem statement.
TL;DR: A typical case of all goroutines are asleep, deadlock! but can't figure it out
I'm parsing the Wiktionary XML dump to build a DB of words. I defer the parsing of each article's text to a goroutine hoping that it will speed up the process.
It's 7GB and is processed in under 2 minutes in my machine when doing it serially, but if I can take advantage of all cores, why not.
I'm new to threading in general, I'm getting a all goroutines are asleep, deadlock! error.
What's wrong here?
This may not be performant at all, as it uses an unbuffered channel, so all goroutines effectively end up executing serially, but my idea is to learn and understand threading and to benchmark how long it takes with different alternatives:
unbuffered channel
different sized buffered channel
only calling as many goroutines at a time as there are runtime.NumCPU()
The summary of my code in pseudocode:
while tag := xml.getNextTag() {
wg.Add(1)
go parseTagText(chan, wg, tag.text)
// consume a channel message if available
select {
case msg := <-chan:
// do something with msg
default:
}
}
// reading tags finished, wait for running goroutines, consume what's left on the channel
for msg := range chan {
// do something with msg
}
// Sometimes this point is never reached, I get a deadlock
wg.Wait()
----
func parseTagText(chan, wg, tag.text) {
defer wg.Done()
// parse tag.text
chan <- whatever // just inform that the text has been parsed
}
Complete code:
https://play.golang.org/p/0t2EqptJBXE
In your complete example on the Go Playground, you:
Create a channel (line 39, results := make(chan langs)) and a wait-group (line 40, var wait sync.WaitGroup). So far so good.
Loop: in the loop, sometimes spin off a task:
if ...various conditions... {
wait.Add(1)
go parseTerm(results, &wait, text)
}
In the loop, sometimes do a non-blocking read from the channel (as shown in your question). No problem here either. But...
At the end of the loop, use:
for res := range results {
...
}
without ever calling close(results) in exactly one place, after all writers finish. This loop uses a blocking read from the channel. As long as some writer goroutine is still running, the blocking read can block without having the whole system stop, but when the last writer finishes writing and exits, there are no remaining writer goroutines. Any other remaining goroutines might rescue you, but there are none.
Since you use the var wait correctly (adding 1 in the right place, and calling Done() in the right place in the writer), the solution is to add one more goroutine, which will be the one to rescue you:
go func() {
wait.Wait()
close(results)
}()
You should spin off this rescuer goroutine just before entering the for res := range results loop. (If you spin it off any earlier, it might see the wait variable count down to zero too soon, just before it gets counted up again by spinning off another parseTerm.)
This anonymous function will block in the wait variable's Wait() function until the last writer goroutine has called the final wait.Done(), which will unblock this goroutine. Then this goroutine will call close(results), which will arrange for the for loop in your main goroutine to finish, unblocking that goroutine. When this goroutine (the rescuer) returns and thus terminates, there are no more rescuers, but we no longer need any.
(This main code then calls wait.Wait() unnecessarily: Since the for didn't terminate until the wait.Wait() in the new goroutine already unblocked, we know that this next wait.Wait() will return immediately. So we can drop this second call, although leaving it in is harmless.)
The problem is that nothing is closing the results channel, yet the range loop only exits when it closes. I've simplified your code to illustrate this and propsed a solution - basically consume the data in a goroutine:
// This is our producer
func foo(i int, ch chan int, wg *sync.WaitGroup) {
defer wg.Done()
ch <- i
fmt.Println(i, "done")
}
// This is our consumer - it uses a different WG to signal it's done
func consumeData(ch chan int, wg *sync.WaitGroup) {
defer wg.Done()
for x := range ch {
fmt.Println(x)
}
fmt.Println("ALL DONE")
}
func main() {
ch := make(chan int)
wg := sync.WaitGroup{}
// create the producers
for i := 0; i < 10; i++ {
wg.Add(1)
go foo(i, ch, &wg)
}
// create the consumer on a different goroutine, and sync using another WG
consumeWg := sync.WaitGroup{}
consumeWg.Add(1)
go consumeData(ch,&consumeWg)
wg.Wait() // <<<< means that the producers are done
close(ch) // << Signal the consumer to exit
consumeWg.Wait() // << Wait for the consumer to exit
}
I'm following a quick intro to Go and one of the examples is:
package main
import (
"fmt"
"time"
)
func worker(done chan bool) {
fmt.Print("working...")
time.Sleep(time.Second)
fmt.Println("done")
done <- true
}
func main() {
done := make(chan bool, 1)
go worker(done)
<-done
}
I understand whats occuring but I guess I'm not grasping the sequence of events or the limitations?
A channel is created called done with a buffer size of 1.
The channel is passed into a function
After the timer is complete it adds a true boolean to the channel
I'm not sure what the final <-done is doing though
from: https://gobyexample.com/channel-synchronization
Receiver operator <- followed by channel name (done in this case) is used to wait for a value written to channel from worker goroutine. (i.e this read operation will be blocking. If you omit <-done, main goroutine will exit immediately even before worker's goroutine start and you won't be able to see results)
You can do whatever you want with <-done as value: assign it to another variable, pass it as a parameter to another function or just ignore it as in your case... etc.
iam currently playing around with go routines, channels and sync.WaitGroup. I know waitgroup is used to wait for all go routines to complete based on weather wg.Done() has been called enough times to derement the value set in wg.Add().
i wrote a small bit of code to try to test this in golang playground. show below
var channel chan int
var wg sync.WaitGroup
func main() {
channel := make(chan int)
mynums := []int{1,2,3,4,5,6,7,8,9}
wg.Add(1)
go addStuff(mynums)
wg.Wait()
close(channel)
recieveStuff(channel)
}
func addStuff(mynums []int) {
for _, val := range mynums {
channel <- val
}
wg.Done()
}
func recieveStuff(channel chan int) {
for val := range channel{
fmt.Println(val)
}
}
Iam getting a deadlock error. iam trying to wait on the routing to return with wg.Wait()? then, close the channel. Afterwards, send the channel to the recievestuff method to output the values in the slice? but it doesnt work. Ive tried moving the close() method inside the go routine after the loop also as i thought i may of been trying to close on the wrong routine in main(). ive found this stuff relatively confusing so far coming from java and c#. Any help is appreciated.
The call to wg.Wait() wouldn't return until wg.Done() has been called once.
In addStuff(), you're writing values to a channel when there's no other goroutine to drain those values. Since the channel is unbuffered, the first call to channel <- val would block forever, resulting in a deadlock.
Moreover, the channel in addStuff() remains nil since you're creating a new variable binding inside main, instead of assigning to the package-level variable. Writing to a nil channel blocks forever:
channel := make(chan int) //doesn't affect the package-level variable
Here's a modified example that runs to completion by consuming all values from the channel:
https://play.golang.org/p/6gcyDWxov7