In Go, how to handle many goroutines with a channel

I'm thinking of starting 1000 goroutines at the same time using a for loop in Go.
The problem is that I have to make sure every goroutine has finished executing.
Is it possible to use channels to make sure of that?
The structure is roughly like this:
func main {
for i ... {
go ...
ch?
ch?
}

As @Andy mentioned, you can use sync.WaitGroup to achieve this. Below is an example; I hope the code is self-explanatory.
package main
import (
"fmt"
"sync"
"time"
)
func dosomething(millisecs int64, wg *sync.WaitGroup) {
defer wg.Done()
duration := time.Duration(millisecs) * time.Millisecond
time.Sleep(duration)
fmt.Println("Function in background, duration:", duration)
}
func main() {
arr := []int64{200, 400, 150, 600}
var wg sync.WaitGroup
for _, n := range arr {
wg.Add(1)
go dosomething(n, &wg)
}
wg.Wait()
fmt.Println("Done")
}

To make sure the goroutines are done and collect the results, try this example:
package main
import (
"fmt"
)
const max = 1000
func main() {
for i := 1; i <= max; i++ {
go f(i)
}
sum := 0
for i := 1; i <= max; i++ {
sum += <-ch
}
fmt.Println(sum) // 500500
}
func f(n int) {
// do some job here and return the result:
ch <- n
}
var ch = make(chan int, max)
In order to wait for 1000 goroutines to finish, try this example:
package main
import (
"fmt"
"sync"
)
func main() {
wg := &sync.WaitGroup{}
for i := 0; i < 1000; i++ {
wg.Add(1)
go f(wg, i)
}
wg.Wait()
fmt.Println("Done.")
}
func f(wg *sync.WaitGroup, n int) {
defer wg.Done()
fmt.Print(n, " ")
}

I would suggest that you follow a pattern. Concurrency and channels are good, but if you use them badly your program might become even slower than expected. A simple way to handle multiple goroutines and channels is the worker pool pattern.
Take a close look at the code below:
// In this example we'll look at how to implement
// a _worker pool_ using goroutines and channels.
package main
import "fmt"
import "time"
// Here's the worker, of which we'll run several
// concurrent instances. These workers will receive
// work on the `jobs` channel and send the corresponding
// results on `results`. We'll sleep a second per job to
// simulate an expensive task.
func worker(id int, jobs <-chan int, results chan<- int) {
for j := range jobs {
fmt.Println("worker", id, "started job", j)
time.Sleep(time.Second)
fmt.Println("worker", id, "finished job", j)
results <- j * 2
}
}
func main() {
// In order to use our pool of workers we need to send
// them work and collect their results. We make 2
// channels for this.
jobs := make(chan int, 100)
results := make(chan int, 100)
// This starts up 3 workers, initially blocked
// because there are no jobs yet.
for w := 1; w <= 3; w++ {
go worker(w, jobs, results)
}
// Here we send 5 `jobs` and then `close` that
// channel to indicate that's all the work we have.
for j := 1; j <= 5; j++ {
jobs <- j
}
close(jobs)
// Finally we collect all the results of the work.
for a := 1; a <= 5; a++ {
<-results
}
}
This simple example is taken from here. Also, the results channel can help you keep track of all the goroutines executing the jobs, including failure notices.
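The remark about failure notices could be sketched roughly as follows. This is only an illustration under assumptions: the result type, the process function, and the rule that multiples of 4 fail are all made up for the example and are not part of the original worker pool code.

```go
package main

import (
	"errors"
	"fmt"
)

// result pairs a job with its outcome. The type and its fields are
// hypothetical names introduced for this sketch.
type result struct {
	job   int
	value int
	err   error
}

// process is a stand-in for real work. It fails on multiples of 4
// purely to demonstrate failure reporting.
func process(job int) (int, error) {
	if job%4 == 0 {
		return 0, errors.New("simulated failure")
	}
	return job * 2, nil
}

// worker reports every job on results, whether it succeeded or not.
func worker(jobs <-chan int, results chan<- result) {
	for j := range jobs {
		v, err := process(j)
		results <- result{job: j, value: v, err: err}
	}
}

// runPool runs numJobs jobs on numWorkers workers and returns how many
// succeeded and how many failed.
func runPool(numWorkers, numJobs int) (succeeded, failed int) {
	jobs := make(chan int, numJobs)
	results := make(chan result, numJobs)
	for w := 0; w < numWorkers; w++ {
		go worker(jobs, results)
	}
	for j := 1; j <= numJobs; j++ {
		jobs <- j
	}
	close(jobs)
	// Exactly one result arrives per job, so counting receives is enough.
	for i := 0; i < numJobs; i++ {
		if r := <-results; r.err != nil {
			failed++
		} else {
			succeeded++
		}
	}
	return succeeded, failed
}

func main() {
	ok, bad := runPool(3, 10)
	fmt.Println("succeeded:", ok, "failed:", bad) // succeeded: 8 failed: 2
}
```

Because every job produces exactly one result, the collector knows all work is finished once it has received numJobs values, with no extra synchronization.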

Related

How to properly delay between executing a pool of workers

Good day,
I'm trying to implement the correct delay between runs of workers. For example, the workers need to complete 30 tasks and then sleep for 5 seconds. How can I track in the code that exactly 30 tasks have been completed, and only then sleep for 5 seconds?
Below is the code that creates a pool of 30 workers, which perform tasks 30 at a time in no particular order:
package main
import (
"fmt"
"math/rand"
"sync"
"time"
)
type Job struct {
id int
randomno int
}
type Result struct {
job Job
sumofdigits int
}
var jobs = make(chan Job, 10)
var results = make(chan Result, 10)
func digits(number int) int {
sum := 0
no := number
for no != 0 {
digit := no % 10
sum += digit
no /= 10
}
time.Sleep(2 * time.Second)
return sum
}
func worker(wg *sync.WaitGroup) {
for job := range jobs {
output := Result{job, digits(job.randomno)}
results <- output
}
wg.Done()
}
func createWorkerPool(noOfWorkers int) {
var wg sync.WaitGroup
for i := 0; i < noOfWorkers; i++ {
wg.Add(1)
go worker(&wg)
}
wg.Wait()
close(results)
}
func allocate(noOfJobs int) {
for i := 0; i < noOfJobs; i++ {
if i != 0 && i%30 == 0 {
fmt.Println("Sleeping for 5 seconds...")
time.Sleep(5 * time.Second)
}
randomno := rand.Intn(999)
job := Job{i, randomno}
jobs <- job
}
close(jobs)
}
func result(done chan bool) {
for result := range results {
fmt.Printf("Job id %d, input random no %d , sum of digits %d\n", result.job.id, result.job.randomno, result.sumofdigits)
}
done <- true
}
func main() {
startTime := time.Now()
noOfJobs := 100
go allocate(noOfJobs)
done := make(chan bool)
go result(done)
noOfWorkers := 30
createWorkerPool(noOfWorkers)
<-done
endTime := time.Now()
diff := endTime.Sub(startTime)
fmt.Println("total time taken ", diff.Seconds(), "seconds")
}
play: https://go.dev/play/p/lehl7hoo-kp
It is not clear to me exactly how to make sure that 30 tasks have been completed, or where to insert the delay. I will be grateful for any help.
Okay, so let's start with this working example:
func Test_t(t *testing.T) {
// just a publisher; it publishes a result on a chan
publish := func(s int, ch chan int, wg *sync.WaitGroup) {
ch <- s // this is blocking!!!
wg.Done()
}
wg := &sync.WaitGroup{}
wg.Add(100)
// we'll use done channel to notify the work is done
res := make(chan int)
done := make(chan struct{})
// create worker that will notify that all results were published
go func() {
wg.Wait()
done <- struct{}{}
}()
// let's create jobs that publish on our res chan
// please note all goroutines are created immediately
for i := 0; i < 100; i++ {
go publish(i, res, wg)
}
// let's receive 30 results and then wait
var resCounter int
forloop:
for {
select {
case ss := <-res:
println(ss)
resCounter += 1
// pause after every 30 results
if resCounter%30 == 0 {
// after receiving 30 results we are blocking this thread
// no more results will be taken from the channel for 5 seconds
println("received 30 results, waiting...")
time.Sleep(5 * time.Second)
}
case <-done:
// we are done here, let's break this infinite loop
break forloop
}
}
}
I hope this shows, more or less, how it can be done.
So, what's the problem with your code?
To be honest, it looks fine (30 results are published, then the code waits, then another 30 results, and so on), but the question is: where would you like to wait?
There are a few possibilities, I guess:
1. creating workers (this is how your code works now: as far as I can see, it publishes jobs in packs of 30; please note that the 2-second delay in the digits function applies only to the goroutine in which it executes)
2. triggering workers (the "wait" code would go in the worker function and stop more workers from running, so it must track how many results have been published)
3. handling results (this is how my code works; the synchronization happens in the forloop loop)
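If the goal is strictly "finish 30 tasks, then pause", one way to sketch it is a WaitGroup per batch: start a batch, wait for all of it to complete, sleep, and only then start the next batch. This is a minimal sketch, not a drop-in replacement for the code in the question; runInBatches and its work parameter are hypothetical names, and the short pause in main is chosen only to keep the example quick.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runInBatches processes total jobs in groups of batchSize and sleeps
// for pause between groups. It returns the size of each completed batch.
// work is a hypothetical stand-in for the real task.
func runInBatches(total, batchSize int, pause time.Duration, work func(id int)) []int {
	var sizes []int
	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}
		var wg sync.WaitGroup
		for id := start; id < end; id++ {
			wg.Add(1)
			go func(id int) {
				defer wg.Done()
				work(id)
			}(id)
		}
		// Wait returns only after every job in this batch has called Done.
		wg.Wait()
		sizes = append(sizes, end-start)
		if end < total {
			time.Sleep(pause) // pause between batches, not after the last one
		}
	}
	return sizes
}

func main() {
	sizes := runInBatches(100, 30, 10*time.Millisecond, func(id int) {
		time.Sleep(time.Millisecond) // simulate work
	})
	fmt.Println(sizes) // [30 30 30 10]
}
```

The trade-off is that a slow task holds up the whole batch, since the next batch does not start until wg.Wait() returns.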

Getting deadlock as I try to emulate fan in - fan out with factorial calculations

I am trying the fan in - fan out pattern with a factorial problem. But I am getting:
fatal error: all goroutines are asleep - deadlock!
and unable to identify the reason for deadlock.
I am trying to concurrently calculate factorial for 100 numbers using the fan-in fan-out pattern.
package main
import (
"fmt"
)
func main() {
_inChannel := _inListener(generator())
for val := range _inChannel {
fmt.Print(val, " -- ")
}
}
func generator() chan int { // NEED TO CALCULATE FACTORIAL FOR 100 NUMBERS
ch := make(chan int) // CREATE CHANNEL TO INPUT NUMBERS
go func() {
for i := 1; i <= 100; i++ {
ch <- i
}
close(ch) // CLOSE CHANNEL WHEN ALL NUMBERS HAVE BEEN WRITTEN
}()
return ch
}
func _inListener(ch chan int) chan int {
rec := make(chan int) // CHANNEL RECEIVED FROM GENERATOR
go func() {
for num := range ch { // RECEIVE THE INPUT NUMBERS FROM GENERATOR
result := factorial(num) // RESULT IS A NEW CHANNEL CREATED
rec <- <-result // MERGE INTO A SINGLE CHANNEL; rec
close(result)
}
close(rec)
}()
return rec // RETURN THE DEDICATED CHANNEL TO RECEIVE ALL OUTPUTS
}
func factorial(n int) chan int {
ch := make(chan int) // MAKE A NEW CHANNEL TO OUTPUT THE RESULT
// OF FACTORIAL
total := 1
for i := n; i > 0; i-- {
total *= i
}
ch <- total
return ch // RETURN THE CHANNEL HAVING THE FACTORIAL CALCULATED
}
I have put in comments, so that it becomes easier to follow the code.
I'm no expert in channels. I've taken this on to try to get more familiar with Go.
Another issue is that int isn't large enough to hold factorials above 20 or so.
As you can see, I added a defer close as well as a logical channel called done in the generator func. The rest of the changes probably aren't needed. With channels, you need to make sure something is ready to receive a value when you send one on an unbuffered channel; otherwise you get a deadlock. Also, using
go run -race main.go
helps at least see which line(s) are causing problems.
I hope this helps and isn't removed for being off topic.
I was able to remove the deadlock by doing this:
package main
import (
"fmt"
)
func main() {
_gen := generator()
_inChannel := _inListener(_gen)
for val := range _inChannel {
fmt.Print(val, " -- \n")
}
}
func generator() chan int { // NEED TO CALCULATE FACTORIAL FOR 100 NUMBERS
ch := make(chan int) // CREATE CHANNEL TO INPUT NUMBERS
done := make(chan bool)
go func() {
defer close(ch)
for i := 1; i <= 100; i++ {
ch <- i
}
//close(ch) // CLOSE CHANNEL WHEN ALL NUMBERS HAVE BEEN WRITTEN
done <- true
}()
// this goroutine receives the single done signal sent above.
go func() {
<-done
}()
return ch
}
func _inListener(ch chan int) chan int {
rec := make(chan int) // CHANNEL RECEIVED FROM GENERATOR
go func() {
for num := range ch { // RECEIVE THE INPUT NUMBERS FROM GENERATOR
result := factorial(num) // COMPUTE THE FACTORIAL OF THE INPUT
rec <- result // SEND IT ON THE SINGLE OUTPUT CHANNEL rec
}
close(rec)
}()
return rec // RETURN THE DEDICATED CHANNEL TO RECEIVE ALL OUTPUTS
}
func factorial(n int) int {
// COMPUTE THE FACTORIAL DIRECTLY; NO CHANNEL IS NEEDED HERE
total := 1
for i := n; i > 0; i-- {
total *= i
}
return total // RETURN THE COMPUTED FACTORIAL (OVERFLOWS int FOR n > 20)
}
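For comparison, here is a rough sketch of an actual fan-out/fan-in layout that also avoids the integer overflow by using math/big. It is an illustration under assumptions rather than the approach of either post above: the result type, fanOutFanIn, and the worker count of 4 are names and values chosen for the example.

```go
package main

import (
	"fmt"
	"math/big"
	"sync"
)

// factorial returns n! as a big.Int so large results do not overflow.
func factorial(n int64) *big.Int {
	return new(big.Int).MulRange(1, n)
}

// result carries the input alongside its factorial because fan-in
// does not preserve order.
type result struct {
	n    int64
	fact *big.Int
}

// fanOutFanIn computes 1!..max! on numWorkers goroutines and returns
// the results indexed by n.
func fanOutFanIn(max int64, numWorkers int) map[int64]*big.Int {
	in := make(chan int64)
	out := make(chan result)

	// Generator: feeds the numbers and closes the channel when done.
	go func() {
		defer close(in)
		for i := int64(1); i <= max; i++ {
			in <- i
		}
	}()

	// Fan out: every worker reads from the same input channel.
	var wg sync.WaitGroup
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range in {
				out <- result{n: n, fact: factorial(n)}
			}
		}()
	}

	// Close out once all workers have finished, so the range below ends.
	go func() {
		wg.Wait()
		close(out)
	}()

	// Fan in: collect everything from the single output channel.
	facts := make(map[int64]*big.Int)
	for r := range out {
		facts[r.n] = r.fact
	}
	return facts
}

func main() {
	facts := fanOutFanIn(100, 4)
	fmt.Println("20! =", facts[20])
	fmt.Println("digits in 100!:", len(facts[100].String()))
}
```

Every send in this sketch has a matching receiver running in another goroutine, which is the property the original factorial function lacked when it sent on an unbuffered channel before returning it.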

GO program stuck in a loop

// _Closing_ a channel indicates that no more values
// will be sent on it. This can be useful to communicate
// completion to the channel's receivers.
package main
import "fmt"
// In this example we'll use a `jobs` channel to
// communicate work to be done from the `main()` goroutine
// to a worker goroutine. When we have no more jobs for
// the worker we'll `close` the `jobs` channel.
func main() {
jobs := make(chan int, 5)
done := make(chan bool)
// Here's the worker goroutine. It repeatedly receives
// from `jobs` with `j, more := <-jobs`. In this
// special 2-value form of receive, the `more` value
// will be `false` if `jobs` has been `close`d and all
// values in the channel have already been received.
// We use this to notify on `done` when we've worked
// all our jobs.
for i := 1; i <= 3; i++ {
go func() {
for {
j, more := <-jobs
if more {
fmt.Println("received job", j)
} else {
fmt.Println("received all jobs")
done <- true
return
}
}
}()
}
// This sends 3 jobs to the worker over the `jobs`
// channel, then closes it.
j := 0
for {
j++
jobs <- j
fmt.Println("sent job", j)
}
close(jobs)
fmt.Println("sent all jobs")
// We await the worker using the
// [synchronization](channel-synchronization) approach
// we saw earlier.
<-done
}
https://play.golang.org/p/x28R_g8ftS
What I'm trying to do is get all the responses from a paginated URL endpoint. jobs is a channel storing the page number. I have a function in the if more {} branch that checks for an empty response, and I have
done <- true
return
I thought this would stop the goroutine.
But the page generator for {j++; jobs <- j} is causing it to get stuck in a loop. Any idea how this can be resolved?
By definition a for loop without conditions is an infinite loop. Unless you put some logic to break this infinite loop, you'll never get out of it.
In your playground your comment implies that you want to send 3 jobs. You should change your for loop accordingly:
for j := 0; j < 3; j++ {
jobs <- j
fmt.Println("sent job", j)
}
This is a simplified version of a worker. It's not very useful for production-level traffic, but it should serve as a simple example; there are tons of them :-)
package main
import (
"log"
"sync"
)
type worker struct {
jobs chan int
wg *sync.WaitGroup
}
func main() {
w := worker{
jobs: make(chan int, 5), // I only want to work on 5 jobs at any given time
wg: new(sync.WaitGroup),
}
for i := 0; i < 3; i++ {
w.wg.Add(1)
go func(i int) {
defer w.wg.Done()
w.jobs <- i
}(i)
}
// wait in the background so that main can move on to the range loop below and start consuming the job queue
go func() {
w.wg.Wait()
close(w.jobs)
}()
for job := range w.jobs {
log.Println("Got job, I should do something with it", job)
}
}
This is what I was looking for. I have a number generator in an infinite for loop, and the program exits on some condition. In this example the condition is on the value of j, but it could be something else.
https://play.golang.org/p/Ud4etTjrmx
package main
import "fmt"
func jobs(job chan int) {
i := 1
for {
job <- i
i++
}
}
func main() {
jobsChan := make(chan int, 5)
done := false
j := 0
go jobs(jobsChan)
for !done {
j = <-jobsChan
if j < 20 {
fmt.Printf("job %d\n", j)
} else {
done = true
}
}
}
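One thing to note about the version above is that the generator goroutine is never told to stop; it just dies when main returns. If the generator has to be shut down while the program keeps running, a done channel is a common way to do it. A minimal sketch, assuming hypothetical names generate and consumeUntil, with the stop callback standing in for a check such as "the page came back empty":

```go
package main

import "fmt"

// generate sends 1, 2, 3, ... on the returned channel until done is closed.
func generate(done <-chan struct{}) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; ; i++ {
			select {
			case out <- i:
			case <-done:
				return // consumer is finished; stop producing
			}
		}
	}()
	return out
}

// consumeUntil reads values until stop reports true, then signals the
// generator to exit by closing done.
func consumeUntil(stop func(int) bool) []int {
	done := make(chan struct{})
	defer close(done)

	var seen []int
	for j := range generate(done) {
		if stop(j) {
			break
		}
		seen = append(seen, j)
	}
	return seen
}

func main() {
	pages := consumeUntil(func(j int) bool { return j >= 20 })
	fmt.Println(len(pages), "jobs processed, last:", pages[len(pages)-1])
	// prints: 19 jobs processed, last: 19
}
```

The select in generate is what makes this work: the goroutine is always blocked on either the send or the done signal, so closing done is guaranteed to release it.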

Channel deadlock on workerpool

I am playing around with channels by making a worker pool of 1000 workers. Currently I am getting the following error:
fatal error: all goroutines are asleep - deadlock!
Here is my code:
package main
import "fmt"
import "time"
func worker(id int, jobs <-chan int, results chan<- int) {
for j := range jobs {
fmt.Println("worker", id, "started job", j)
time.Sleep(time.Second)
fmt.Println("worker", id, "finished job", j)
results <- j * 2
}
}
func main() {
jobs := make(chan int, 100)
results := make(chan int, 100)
for w := 1; w <= 1000; w++ {
go worker(w, jobs, results)
}
for j := 1; j < 1000000; j++ {
jobs <- j
}
close(jobs)
fmt.Println("==========CLOSED==============")
for i := 0; i < len(results); i++ {
<-results
}
}
Why is this happening? I am still new to go and I am hoping to make sense of this.
While Thomas' answer is basically correct, I'm posting my version, which is IMO better Go and also works with unbuffered channels:
func main() {
jobs := make(chan int)
results := make(chan int)
var wg sync.WaitGroup
// you could init the WaitGroup's count here with one call but this is error
// prone - if you change the loop's size you could forget to change the
// WG's count. So call wg.Add in loop
//wg.Add(1000)
for w := 1; w <= 1000; w++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
worker(id, jobs, results)
}(w)
}
go func() {
for j := 1; j < 2000; j++ {
jobs <- j
}
close(jobs)
fmt.Println("==========CLOSED==============")
}()
// in this goroutine we wait until all worker goroutines are done
// then close the results channel so that the consumer loop stops
go func() {
wg.Wait()
close(results)
}()
for i := range results {
fmt.Print(i, " ")
}
fmt.Println("==========DONE==============")
}
The problem is that your channels are filling up. The main() routine tries to put all jobs into the jobs channel before reading any results. But the results channel only has space for 100 results before any write to the channel will block, so all the workers will eventually block waiting for space in this channel – space that will never come, because main() has not started reading from results yet.
To quickly fix this, you can either make jobs big enough to hold all jobs, so the main() function can continue to the reading phase; or you can make results big enough to hold all results, so the workers can output their results without blocking.
A nicer approach is to make another goroutine to fill up the jobs queue, so main() can go straight to reading results:
func main() {
jobs := make(chan int, 100)
results := make(chan int, 100)
for w := 1; w <= 1000; w++ {
go worker(w, jobs, results)
}
go func() {
for j := 1; j < 1000000; j++ {
jobs <- j
}
close(jobs)
fmt.Println("==========CLOSED==============")
}()
for i := 1; i < 1000000; i++ {
<-results
}
}
Note that I had to change the final for loop to a fixed number of iterations, otherwise it might terminate before all the results have been read.
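To see why a loop bounded by len(results) is unreliable even in the best case, here is a small self-contained sketch. The helper names drainWithLen, drainCount, and filled are made up for the illustration. len on a channel reports how many values are buffered at that instant, and the loop condition re-evaluates it on every iteration while the receives shrink it.

```go
package main

import "fmt"

// drainWithLen mimics a loop bounded by len(ch): the bound is
// re-evaluated on every iteration and shrinks as values are received.
func drainWithLen(ch chan int) int {
	n := 0
	for i := 0; i < len(ch); i++ {
		<-ch
		n++
	}
	return n
}

// drainCount receives exactly want values.
func drainCount(ch chan int, want int) int {
	n := 0
	for i := 0; i < want; i++ {
		<-ch
		n++
	}
	return n
}

// filled returns a buffered channel already holding size values.
func filled(size int) chan int {
	ch := make(chan int, size)
	for i := 0; i < size; i++ {
		ch <- i
	}
	return ch
}

func main() {
	fmt.Println(drainWithLen(filled(4)))  // 2: half the values stay in the channel
	fmt.Println(drainCount(filled(4), 4)) // 4
}
```

In the worker pool the situation is worse than in this sketch, because workers may not have written anything yet when len is first read, so the loop can exit having received nothing.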
The following code:
for j := 1; j < 1000000; j++ {
jobs <- j
}
should run in a separate goroutine. Otherwise all the workers block waiting for the main goroutine to receive on the results channel, while the main goroutine is stuck in the loop.
package main
import (
"fmt"
"sync"
"time"
)
func worker(id int, jobs <-chan int, results chan<- int, wg *sync.WaitGroup) {
defer wg.Done()
for j := range jobs {
fmt.Println("worker", id, "started job", j)
time.Sleep(time.Millisecond * time.Duration(10))
fmt.Println("worker", id, "finished job", j)
results <- j * 2
}
}
func main() {
jobs := make(chan int, 100)
results := make(chan int, 100)
wg := new(sync.WaitGroup)
wg.Add(1000)
for w := 1; w <= 1000; w++ {
go worker(w, jobs, results, wg)
}
go func() {
wg.Wait()
close(results)
}()
go func() {
for j := 1; j < 1000000; j++ {
jobs <- j
}
close(jobs)
}()
sum := 0
for v := range results {
sum += v
}
fmt.Println("==========CLOSED==============")
fmt.Println("sum", sum)
}

Why does this golang program create a memory leak?

I am trying to understand concurrency and goroutines, and had a couple of questions about the following experimental code:
1. Why does it create a memory leak? I thought that a return at the end of the goroutine would allow memory associated with it to get cleaned up.
2. Why do my loops almost never reach 999? In fact, when I output to a file and study the output, I notice that it rarely prints integers in double digits; the first time it prints "99" is line 2461, and for "999" line 6120. This behavior is unexpected to me, which clearly means I don't really understand what is going on with goroutine scheduling.
Disclaimer:
Be careful running the code below, it can crash your system if you don't stop it after a few seconds!
CODE
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for {
// spawn four worker goroutines
spawnWorkers(4, wg)
// wait for the workers to finish
wg.Wait()
}
}
func spawnWorkers(max int, wg sync.WaitGroup) {
for n := 0; n < max; n++ {
wg.Add(1)
go func() {
defer wg.Done()
f(n)
return
}()
}
}
func f(n int) {
for i := 0; i < 1000; i++ {
fmt.Println(n, ":", i)
}
}
Thanks to Tim Cooper, JimB, and Greg for their helpful comments. The corrected version of the code is posted below for reference.
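The two fixes were to pass the WaitGroup by pointer, which fixed the memory leak, and to pass n into the anonymous goroutine as an argument. With the WaitGroup passed by value, spawnWorkers called Add on a copy, so wg.Wait() in main returned immediately and the loop kept spawning goroutines without bound.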
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
for {
// spawn four worker goroutines
spawnWorkers(4, &wg)
// wait for the workers to finish
wg.Wait()
}
}
func spawnWorkers(max int, wg *sync.WaitGroup) {
for n := 0; n < max; n++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
f(n)
return
}(n)
}
}
func f(n int) {
for i := 0; i < 1000; i++ {
fmt.Println(n, ":", i)
}
}
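For anyone who wants to try the pattern without the infinite loop, here is a bounded sketch of the same idea. It is an illustration under assumptions: runRounds, the out slice, and the sum function are hypothetical additions, and the work is reduced to simple arithmetic so the result can be checked.

```go
package main

import (
	"fmt"
	"sync"
)

// spawnWorkers starts max goroutines; worker n stores its result in out[n].
// The WaitGroup is passed by pointer so Add, Done, and Wait all act on
// the same counter.
func spawnWorkers(max int, wg *sync.WaitGroup, out []int) {
	for n := 0; n < max; n++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			out[n] = sum(n)
		}(n)
	}
}

// sum is a stand-in for real work: it adds n to a total 1000 times.
func sum(n int) int {
	total := 0
	for i := 0; i < 1000; i++ {
		total += n
	}
	return total
}

// runRounds repeats the spawn-and-wait cycle a fixed number of times
// instead of looping forever.
func runRounds(rounds, workers int) []int {
	out := make([]int, workers)
	var wg sync.WaitGroup
	for r := 0; r < rounds; r++ {
		spawnWorkers(workers, &wg, out)
		wg.Wait() // blocks until every worker of this round has called Done
	}
	return out
}

func main() {
	fmt.Println(runRounds(3, 4)) // [0 1000 2000 3000]
}
```

Passing n as an argument gives each goroutine its own copy. Since Go 1.22 each loop iteration already gets a fresh n, but the explicit argument keeps the code correct on older toolchains too.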
