i am new in golang recently. i have a question about goroutine when use time.sleep function. here the code.
package main
import (
"fmt"
"time"
)
func go1(msg_chan chan string) {
for {
msg_chan <- "go1"
}
}
func go2(msg_chan chan string) {
for {
msg_chan <- "go2"
}
}
func count(msg_chan chan string) {
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
go go1(c)
go go2(c)
go count(c)
var input string
fmt.Scanln(&input)
}
and output is
go1
go2
go1
go2
go1
go2
i think when count function is execute sleep function, go1 and go2 will execute in random sequence. so the out put maybe like
go1
go1
go2
go2
go2
go1
when i delete the sleep code in count function. the result as i supposed , it's random.
i am stucked in this issue.
thanks.
First thing to notice is that there are three go routines and all of them are independent of each other. The only thing that combines the two go routines with count routine is the channel on which both go routines are sending the values.
time.Sleep is not making the go routines synchronous. On using time.Sleep you are actually letting the count go routine to wait for that long which let other go routine to send the value on the channel which is available for the count go routine to be able to receive.
One more thing that you can do to check it is increase the number of CPU's which will give you the random result.
func GOMAXPROCS(n int) int
GOMAXPROCS sets the maximum number of CPUs that can be executing
simultaneously and returns the previous setting. If n < 1, it does not
change the current setting. The number of logical CPUs on the local
machine can be queried with NumCPU. This call will go away when the
scheduler improves.
The number of CPUs available simultaneously to executing goroutines is
controlled by the GOMAXPROCS shell environment variable, whose default
value is the number of CPU cores available. Programs with the
potential for parallel execution should therefore achieve it by
default on a multiple-CPU machine. To change the number of parallel
CPUs to use, set the environment variable or use the similarly-named
function of the runtime package to configure the run-time support to
utilize a different number of threads. Setting it to 1 eliminates the
possibility of true parallelism, forcing independent goroutines to
take turns executing.
Considering the part where output of the go routine is random, it is always random. But channels most probably work in queue which is FIFO(first in first out) as it depends on which value is available on the channel to b received. So whichever be the value available on the channel to be sent is letting the count go routine to wait and print that value.
Take for an example even if I am using time.Sleep the output is random:
package main
import (
"fmt"
"time"
)
func go1(msg_chan chan string) {
for i := 0; i < 10; i++ {
msg_chan <- fmt.Sprintf("%s%d", "go1:", i)
}
}
func go2(msg_chan chan string) {
for i := 0; i < 10; i++ {
msg_chan <- fmt.Sprintf("%s%d", "go2:", i)
}
}
func count(msg_chan chan string) {
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
go go1(c)
go go2(c)
go count(c)
time.Sleep(time.Second * 20)
fmt.Println("finished")
}
This sometimes leads to race condition which is why we use synchronization either using channels or wait.groups.
package main
import (
"fmt"
"sync"
"time"
)
var wg sync.WaitGroup
func go1(msg_chan chan string) {
defer wg.Done()
for {
msg_chan <- "go1"
}
}
func go2(msg_chan chan string) {
defer wg.Done()
for {
msg_chan <- "go2"
}
}
func count(msg_chan chan string) {
defer wg.Done()
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
wg.Add(1)
go go1(c)
wg.Add(1)
go go2(c)
wg.Add(1)
go count(c)
wg.Wait()
fmt.Println("finished")
}
Now coming to the part where you are using never ending for loop to send the values on a channel. So if you remove the time.Sleep your process will hang since the loop will never stop to send the values on the channel.
Related
Say I have a function
type Foo struct {}
func (a *Foo) Bar() {
// some expensive work - does some calls to redis
}
which gets executed within a goroutine at some point in my app. Lots of these may be executing at any given point. Prior to application termination, I would like to ensure all remaining goroutines have finished their work.
Can I do something like this:
type Foo struct {
wg sync.WaitGroup
}
func (a *Foo) Close() {
a.wg.Wait()
}
func (a *Foo) Bar() {
a.wg.Add(1)
defer a.wg.Done()
// some expensive work - does some calls to redis
}
Assuming here that Bar gets executed within a goroutine and many of these may be running at a given time and that Bar should not be called once Close is called and Close is called upon a sigterm or sigint.
Does this make sense?
Usually I would see the Bar function look like this:
func (a *Foo) Bar() {
a.wg.Add(1)
go func() {
defer a.wg.Done()
// some expensive work - does some calls to redis
}()
}
Yes, WaitGroup is the right answer. You can use WaitGroup.Add at anytime that the counter is greater than zero, as per doc.
Note that calls with a positive delta that occur when the counter is zero must happen before a Wait. Calls with a negative delta, or calls with a positive delta that start when the counter is greater than zero, may happen at any time. Typically this means the calls to Add should execute before the statement creating the goroutine or other event to be waited for. If a WaitGroup is reused to wait for several independent sets of events, new Add calls must happen after all previous Wait calls have returned. See the WaitGroup example.
But one trick is that, you should always keep the counter greater than zero, before Close is called. That usually means you should call wg.Add in NewFoo (or something like that) and wg.Done in Close. And to prevent multiple calls to Done ruining the wait group, you should wrap Close into sync.Once. You may also want to prevent new Bar() from being called.
WaitGroup is one way, however, the Go team introduced the errgroup for your use case exactly. The most inconvenient part of leaf bebop's answer, is the disregard for error handling. Error handling is the reason errgroup exists. And idiomatic go code should never swallow errors.
However, keeping the signatures of your Foo struct, (except a cosmetic workerNumber)—and no error handling—my proposal looks like this:
package main
import (
"fmt"
"math/rand"
"time"
"golang.org/x/sync/errgroup"
)
type Foo struct {
errg errgroup.Group
}
func NewFoo() *Foo {
foo := &Foo{
errg: errgroup.Group{},
}
return foo
}
func (a *Foo) Bar(workerNumber int) {
a.errg.Go(func() error {
select {
// simulates the long running clals
case <-time.After(time.Second * time.Duration(rand.Intn(10))):
fmt.Println(fmt.Sprintf("worker %d completed its work", workerNumber))
return nil
}
})
}
func (a *Foo) Close() {
a.errg.Wait()
}
func main() {
foo := NewFoo()
for i := 0; i < 10; i++ {
foo.Bar(i)
}
<-time.After(time.Second * 5)
fmt.Println("Waiting for workers to complete...")
foo.Close()
fmt.Println("Done.")
}
The benefit here, is that if you introduce error handling in your code (you should), you only need to slightly modify this code: In short, errg.Wait() would return the first redis error, and Close() could propagate this up through the stack (to main, in this case).
Utilizing the context.Context package as well, you would also be able to immediately cancel any running redis call, if one fails. There are examples of this in the errgroup documentation.
I think waiting indefinitely for all the go routines to finish is not the right way.
If one of the go routines get blocked or say it hangs due to some reason and never terminates successfully, what should happen kill the process or wait for go routines to finish ?
Instead you should wait with some timeout and kill the app irrespective of whether all the routines have finished or not.
Edit: Original ans
Thanks #leaf bebop for pointing it out. I misunderstood the question.
Context package can be used to signal all the go routines to handle kill signal.
appCtx, cancel := context.WithCancel(context.Background())
Here appCtx will have to be passed to all the go routines.
On exit signal call cancel().
functions running as go routines can handle how to handle cancel context.
Using context cancellation in Go
A pattern i use a lot is: https://play.golang.org/p/ibMz36TS62z
package main
import (
"fmt"
"sync"
"time"
)
type response struct {
message string
}
func task(i int, done chan response) {
time.Sleep(1 * time.Second)
done <- response{fmt.Sprintf("%d done", i)}
}
func main() {
responses := GetResponses(10)
fmt.Println("all done", len(responses))
}
func GetResponses(n int) []response {
donequeue := make(chan response)
wg := sync.WaitGroup{}
for i := 0; i < n; i++ {
wg.Add(1)
go func(value int) {
defer wg.Done()
task(value, donequeue)
}(i)
}
go func() {
wg.Wait()
close(donequeue)
}()
responses := []response{}
for result := range donequeue {
responses = append(responses, result)
}
return responses
}
this makes it easy to throttle as well: https://play.golang.org/p/a4MKwJKj634
package main
import (
"fmt"
"sync"
"time"
)
type response struct {
message string
}
func task(i int, done chan response) {
time.Sleep(1 * time.Second)
done <- response{fmt.Sprintf("%d done", i)}
}
func main() {
responses := GetResponses(10, 2)
fmt.Println("all done", len(responses))
}
func GetResponses(n, concurrent int) []response {
throttle := make(chan int, concurrent)
for i := 0; i < concurrent; i++ {
throttle <- i
}
donequeue := make(chan response)
wg := sync.WaitGroup{}
for i := 0; i < n; i++ {
wg.Add(1)
<-throttle
go func(value int) {
defer wg.Done()
throttle <- 1
task(value, donequeue)
}(i)
}
go func() {
wg.Wait()
close(donequeue)
}()
responses := []response{}
for result := range donequeue {
responses = append(responses, result)
}
return responses
}
I want to send a value in a channel to go routines from main function. What happens is which go routine will receive the value from the channel first.
package main
import (
"fmt"
"math/rand"
//"runtime"
"strconv"
"time"
)
func main() {
var ch chan int
ch = make(chan int)
ch <- 1
receive(ch)
}
func receive(ch chan int){
for i := 0; i < 4; i++ {
// Create some threads
go func(i int) {
time.Sleep(time.Duration(rand.Intn(1000)) * time.Millisecond)
fmt.Println(<-ch)
}(i)
}
}
My current implementation is giving an error.
fatal error: all goroutines are asleep - deadlock!
How can I know that which go routine will receive the value from the channel first. And what happens to other go routine If those will run or throw an error since there is no channel to receive the value. As it is already received by one of them.
If a create a buffered channel my code works. So I don't get it what has happened behind the scene which is making it work when creating a buffered channel like below:
func main() {
var ch chan int
ch = make(chan int, 10)
ch <- 1
receive(ch)
}
If we look at below code. I can see that we can send values through channels directly there is no need of creating a go routine to send a value thorugh a channel to another go routines.
package main
import "fmt"
func main() {
// We'll iterate over 2 values in the `queue` channel.
queue := make(chan string, 2)
queue <- "one"
queue <- "two"
close(queue)
for elem := range queue {
fmt.Println(elem)
}
}
Then what is wrong with my code. Why is it creating a deadlock.
If all you need is to start several workers and send a task to any of them, then you'd better run workers before sending a value to a channel, because as #mkopriva said above, writing to a channel is a blocking operation. You always have to have a consumer, or the execution will freeze.
func main() {
var ch chan int
ch = make(chan int)
receive(ch)
ch <- 1
}
func receive(ch chan int) {
for i := 0; i < 4; i++ {
// Create some threads
go func(i int) {
time.Sleep(time.Duration(rand.Intn(1000)) * time.Millisecond)
fmt.Printf("Worker no %d is processing the value %d\n", i, <-ch)
}(i)
}
}
Short answer for the question "Which go routine will receive it?" - Whatever. :) Any of them, you can't say for sure.
However I have no idea what is time.Sleep(...) for there, kept it as is.
An unbuffered channel (without a length) blocks until the value has been received. This means the program that wrote to the channel will stop after writing to the channel until it has been read from. If that happens in the main thread, before your call to receive, it causes a deadlock.
There are two more issues: you need to use a WaitGroup to pause the completion until finished, and a channel behaves like a concurrent queue. In particular, it has push and pop operations, which are both performed using <-. For example:
//Push to channel, channel contains 1 unless other things were there
c <- 1
//Pop from channel, channel is empty
x := <-c
Here is a working example:
package main
import (
"fmt"
"math/rand"
"sync"
"time"
)
func main() {
var ch chan int
ch = make(chan int)
go func() {
ch <- 1
ch <- 1
ch <- 1
ch <- 1
}()
receive(ch)
}
func receive(ch chan int) {
wg := &sync.WaitGroup{}
for i := 0; i < 4; i++ {
// Create some threads
wg.Add(1)
go func(i int) {
time.Sleep(time.Duration(rand.Intn(1000)) * time.Millisecond)
fmt.Println(<-ch)
wg.Done()
}(i)
}
wg.Wait()
fmt.Println("done waiting")
}
Playground Link
As you can see the WaitGroup is quite simple as well. You declare it at a higher scope. It's essentially a fancy counter, with three primary methods. When you call wg.Add(1) the counter is increased, when you call wg.Done() the counter is decreased, and when you call wg.Wait(), the execution is halted until the counter reaches 0.
I have a bunch of goroutines doing something in a loop. I want to be able to pause all of them, run some arbitrary code, then resume them. The way I attempted to do this is probably not idiomatic (and I'd appreciate a better solution), but I can't understand why it doesn't work.
Stripped down to the essentials (driver code at the bottom):
type looper struct {
pause chan struct{}
paused sync.WaitGroup
resume chan struct{}
}
func (l *looper) loop() {
for {
select {
case <-l.pause:
l.paused.Done()
<-l.resume
default:
dostuff()
}
}
}
func (l *looper) whilePaused(fn func()) {
l.paused.Add(32)
l.resume = make(chan struct{})
close(l.pause)
l.paused.Wait()
fn()
l.pause = make(chan struct{})
close(l.resume)
}
I spin up 32 goroutines all running loop(), then call whilePaused 100 times in a row, and everything seems to work… but if I run it with -race, it tells me that there's a race on l.resume between writing it in whilePaused (l.resume = make(chan struct{})) and reading it in loop (<-l.resume).
I don't understand why this happens. According to The Go Memory Model, that close(l.pause) should happen before the <-l.pause in every loop goroutine. This should mean the make(chan struct{}) value is visible as the value of l.resume in all of those loop goroutines, in the same way the string "hello world" is visible as the value of a in the f goroutine in the docs example.
Some additional information that might be relevant:
If I replace l.resume with an unsafe.Pointer and access the chan struct{} value with atomic.LoadPointer in loop and atomic.StorePointer in whilePaused, the race goes away. This seems to be providing the exact same acquire-release ordering that the channel is already supposed to provide?
If I add a time.Sleep(10 * time.Microsecond) between the l.paused.Done() and <-l.resume, the program usually deadlocks after calling fn one or two times.
If I add a fmt.Printf(".") instead, the program prints 28 .s, calls the first function, prints another 32 .s, then hangs (or, occasionally, calls the second function, then prints another 32 .s and hangs).
Here's the rest of my code, in case you want to run the whole thing:
package main
import (
"fmt"
"sync"
"sync/atomic"
)
// looper code from above
var n int64
func dostuff() {
atomic.AddInt64(&n, 1)
}
func main() {
l := &looper{
pause: make(chan struct{}),
}
var init sync.WaitGroup
init.Add(32)
for i := 0; i < 32; i++ {
go func() {
init.Done()
l.loop()
}()
}
init.Wait()
for i := 0; i < 100; i++ {
l.whilePaused(func() { fmt.Printf("%d ", i) })
}
fmt.Printf("\n%d\n", atomic.LoadInt64(&n))
}
This is because after the thread does l.paused.Done(), the other thread is able to go around the loop and assign l.resume again
Here is sequence of operations
Looper thread | Pauser thread
------------------------------------
l.paused.Done() |
| l.paused.Wait()
| l.pause = make(chan struct{})
| round the loop
| l.paused.Add(numThreads)
<- l.resume | l.resume = make(chan struct{}) !!!RACE!!
Sorry about the noob question but I'm having a hard time wrapping my head around the concurrency part of go. Basically this program below is a simplified version of a larger one I'm writing, thus I want to keep the structure similar to below.
Basically instead of waiting 4 seconds I want to run addCount(..) concurrent using the unbuffered channel and when all elements in the int_slice has been processed I want to do another operation on them. However this program ends with a "panic: close of closed channel" and if I remove the closing of the channel I'm getting the output I'm expecting but it panics with: "fatal error: all goroutines are asleep - deadlock"
How can I implement the concurrency part correctly in this scenario?
Thanks in advance!
package main
import (
"fmt"
"time"
)
func addCount(num int, counter chan<- int) {
time.Sleep(time.Second * 2)
counter <- num * 2
}
func main() {
counter := make(chan int)
int_slice := []int{2, 4}
for _, item := range int_slice {
go addCount(item, counter)
close(counter)
}
for item := range counter {
fmt.Println(item)
}
}
Here are the issues I spotted in the code, and below a working version based on your implementation.
If a goroutine tries to write to an "unbuffered" channel, it will block until someone reads from it. Since you are not reading until they finish writing to the channel, you have a deadlock there.
Closing the channel while they are blocked breaks the deadlock, but gives an error since they now can't write to a closed channel.
Solution involves:
Creating a buffered channel so that they can write without blocking.
Using a sync.WaitGroup so that you wait for the goroutines to finish before closing the channel.
Reading from the channel at the end, when all is done.
See here, with comments:
package main
import (
"fmt"
"time"
"sync"
)
func addCount(num int, counter chan<- int, wg *sync.WaitGroup) {
// clear one from the sync group
defer wg.Done()
time.Sleep(time.Second * 2)
counter <- num * 2
}
func main() {
int_slice := []int{2, 4}
// make the slice buffered using the slice size, so that they can write without blocking
counter := make(chan int, len(int_slice))
var wg sync.WaitGroup
for _, item := range int_slice {
// add one to the sync group, to mark we should wait for one more
wg.Add(1)
go addCount(item, counter, &wg)
}
// wait for all goroutines to end
wg.Wait()
// close the channel so that we not longer expect writes to it
close(counter)
// read remaining values in the channel
for item := range counter {
fmt.Println(item)
}
}
For the sake of having examples here's a slightly modified version of what #eugenioy submitted. It allows for the use of an unbuffered channel and the reading of values as they come in instead of at the end like a regular for loop.
package main
import (
"fmt"
"sync"
"time"
)
func addCount(num int, counter chan<- int, wg *sync.WaitGroup) {
// clear one from the sync group
defer wg.Done()
// not needed, unless you wanted to slow down the output
time.Sleep(time.Second * 2)
counter <- num * 2
}
func main() {
// variable names don't have underscores in Go
intSlice := []int{2, 4}
counter := make(chan int)
var wg sync.WaitGroup
for _, item := range intSlice {
// add one to the sync group, to mark we should wait for one more
wg.Add(1)
go addCount(item, counter, &wg)
}
// by wrapping wait and close in a go routine I can start reading the channel before its done, I also don't need to know the size of the
// slice
go func() {
wg.Wait()
close(counter)
}()
for item := range counter {
fmt.Println(item)
}
}
package main
import (
"fmt"
"time"
)
func addCount(num int, counter chan <- int) {
time.Sleep(time.Second * 2)
counter <- num * 2
}
func main() {
counter := make(chan int)
int_slice := []int{2, 4}
for _, item := range int_slice {
go addCount(item, counter)
fmt.Println(<-counter)
}
}
This code selects all xml files in the same folder, as the invoked executable and asynchronously applies processing to each result in the callback method (in the example below, just the name of the file is printed out).
How do I avoid using the sleep method to keep the main method from exiting? I have problems wrapping my head around channels (I assume that's what it takes, to synchronize the results) so any help is appreciated!
package main
import (
"fmt"
"io/ioutil"
"path"
"path/filepath"
"os"
"runtime"
"time"
)
func eachFile(extension string, callback func(file string)) {
exeDir := filepath.Dir(os.Args[0])
files, _ := ioutil.ReadDir(exeDir)
for _, f := range files {
fileName := f.Name()
if extension == path.Ext(fileName) {
go callback(fileName)
}
}
}
func main() {
maxProcs := runtime.NumCPU()
runtime.GOMAXPROCS(maxProcs)
eachFile(".xml", func(fileName string) {
// Custom logic goes in here
fmt.Println(fileName)
})
// This is what i want to get rid of
time.Sleep(100 * time.Millisecond)
}
You can use sync.WaitGroup. Quoting the linked example:
package main
import (
"net/http"
"sync"
)
func main() {
var wg sync.WaitGroup
var urls = []string{
"http://www.golang.org/",
"http://www.google.com/",
"http://www.somestupidname.com/",
}
for _, url := range urls {
// Increment the WaitGroup counter.
wg.Add(1)
// Launch a goroutine to fetch the URL.
go func(url string) {
// Decrement the counter when the goroutine completes.
defer wg.Done()
// Fetch the URL.
http.Get(url)
}(url)
}
// Wait for all HTTP fetches to complete.
wg.Wait()
}
WaitGroups are definitely the canonical way to do this. Just for the sake of completeness, though, here's the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done," and have the main goroutine wait until each spawned routine has reported its completion.
func main() {
c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
for i := 0; i < 100; i++ {
go func() {
doSomething()
c <- struct{}{} // signal that the routine has completed
}()
}
// Since we spawned 100 routines, receive 100 messages.
for i := 0; i < 100; i++ {
<- c
}
}
sync.WaitGroup can help you here.
package main
import (
"fmt"
"sync"
"time"
)
func wait(seconds int, wg * sync.WaitGroup) {
defer wg.Done()
time.Sleep(time.Duration(seconds) * time.Second)
fmt.Println("Slept ", seconds, " seconds ..")
}
func main() {
var wg sync.WaitGroup
for i := 0; i <= 5; i++ {
wg.Add(1)
go wait(i, &wg)
}
wg.Wait()
}
Although sync.waitGroup (wg) is the canonical way forward, it does require you do at least some of your wg.Add calls before you wg.Wait for all to complete. This may not be feasible for simple things like a web crawler, where you don't know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.
I wrote a solution using channels, avoiding waitGroup in my solution the the Tour of Go - web crawler exercise. Each time one or more go-routines are started, you send the number to the children channel. Each time a go routine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are done.
My only remaining concern is the hard-coded size of the the results channel, but that is a (current) Go limitation.
// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.waitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls
// (done) and results (results). Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
results chan string
children chan int
done chan int
}
// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
// we buffer results to 1000, so we cannot crawl more pages than that.
return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}
// recursionController.Add: convenience function to add children to controller (similar to waitGroup)
func (rc recursionController) Add(children int) {
rc.children <- children
}
// recursionController.Done: convenience function to remove a child from controller (similar to waitGroup)
func (rc recursionController) Done() {
rc.done <- 1
}
// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
fmt.Println("Controller waiting...")
var children, done int
for {
select {
case childrenDelta := <-rc.children:
children += childrenDelta
// fmt.Printf("children found %v total %v\n", childrenDelta, children)
case <-rc.done:
done += 1
// fmt.Println("done found", done)
default:
if done > 0 && children == done {
fmt.Printf("Controller exiting, done = %v, children = %v\n", done, children)
close(rc.results)
return
}
}
}
}
Full source code for the solution
Here is a solution that employs WaitGroup.
First, define 2 utility methods:
package util
import (
"sync"
)
var allNodesWaitGroup sync.WaitGroup
func GoNode(f func()) {
allNodesWaitGroup.Add(1)
go func() {
defer allNodesWaitGroup.Done()
f()
}()
}
func WaitForAllNodes() {
allNodesWaitGroup.Wait()
}
Then, replace the invocation of callback:
go callback(fileName)
With a call to your utility function:
util.GoNode(func() { callback(fileName) })
Last step, add this line at the end of your main, instead of your sleep. This will make sure the main thread is waiting for all routines to finish before the program can stop.
func main() {
// ...
util.WaitForAllNodes()
}