I'm trying to make a simple worker pool in Go.
After adding the wait group to the following program, I'm facing a deadlock.
What is the core reason behind it?
When I'm not using the wait group, the program seems to work fine.
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc0001b2ea8)
Program -
package main
import (
"fmt"
"strconv"
"sync"
)
func main() {
workerSize := 2
ProcessData(workerSize)
}
// ProcessData :
func ProcessData(worker int) {
	// Create jobs pool for passing jobs to the workers
JobChan := make(chan string)
//Produce the jobs
var jobsArr []string
for i := 1; i <= 10000; i++ {
jobsArr = append(jobsArr, "Test "+strconv.Itoa(i))
}
//Assign jobs to worker from jobs pool
var wg sync.WaitGroup
for w := 1; w <= worker; w++ {
wg.Add(1)
// Consumer
go func(jw int, wg1 *sync.WaitGroup) {
defer wg1.Done()
for job := range JobChan {
actualProcess(job, jw)
}
}(w, &wg)
}
// Create jobs pool
for _, job := range jobsArr {
JobChan <- job
}
wg.Wait()
//close(JobChan)
}
func actualProcess(job string, worker int) {
fmt.Println("WorkerID: #", worker, ", Job Value: ", job)
}
Once all the jobs are consumed, your workers will be waiting in for job := range JobChan for more data. The loop will not finish until the channel is closed.
On the other hand, your main goroutine is waiting for wg.Wait() and does not reach the (commented out) close.
At this point, all goroutines are stuck waiting either for data or for the waitgroup to be done.
The simplest solution is to call close(JobChan) directly after sending all jobs to the channel:
// Create jobs pool
for _, job := range jobsArr {
JobChan <- job
}
close(JobChan)
wg.Wait()
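If you prefer, you can instead run the producer in its own goroutine and let it close the channel once everything has been sent, so that main only waits on the workers. Here is a sketch of the whole program rewritten that way (same actualProcess as before):
package main

import (
	"fmt"
	"strconv"
	"sync"
)

func main() {
	workerSize := 2
	ProcessData(workerSize)
}

// ProcessData starts the workers, feeds them jobs from a separate
// producer goroutine, and waits for the workers to finish.
func ProcessData(worker int) {
	JobChan := make(chan string)

	// Produce the jobs
	var jobsArr []string
	for i := 1; i <= 10000; i++ {
		jobsArr = append(jobsArr, "Test "+strconv.Itoa(i))
	}

	// Consumers
	var wg sync.WaitGroup
	for w := 1; w <= worker; w++ {
		wg.Add(1)
		go func(jw int) {
			defer wg.Done()
			for job := range JobChan {
				actualProcess(job, jw)
			}
		}(w)
	}

	// Producer: closing the channel here lets the workers' range loops end,
	// which in turn lets wg.Wait() return.
	go func() {
		for _, job := range jobsArr {
			JobChan <- job
		}
		close(JobChan)
	}()

	wg.Wait()
}

func actualProcess(job string, worker int) {
	fmt.Println("WorkerID: #", worker, ", Job Value: ", job)
}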
This is a slightly modified, more advanced version of your implementation. I have commented the code well so that it's easy to understand. You can now configure the number of jobs and the number of workers, and even see how the jobs are distributed among the workers so that each one gets a roughly equal amount of work.
package main
import (
"fmt"
)
func main() {
var jobsCount = 10000 // Number of jobs
var workerCount = 2 // Number of workers
processData(workerCount, jobsCount)
}
func processData(workers, numJobs int) {
var jobsArr = make([]string, 0, numJobs)
	// Fill jobsArr with numJobs jobs
for i := 0; i < numJobs; i++ {
// Fill in jobs
jobsArr = append(jobsArr, fmt.Sprintf("Test %d", i+1))
}
var jobChan = make(chan string, 1)
defer close(jobChan)
var (
// Length of jobsArr
length = len(jobsArr)
// Calculate average chunk size
chunks = len(jobsArr) / workers
// Window Start Index
wStart = 0
// Window End Index
wEnd = chunks
)
	// Split the jobs between workers. Every worker gets a chunk of jobsArr
	// to work on. The distribution of work is approximately equal; the last
	// worker may get slightly less or more work than the others.
for i := 1; i <= workers; i++ {
		// Spawn a goroutine for every worker for its chunk, i.e., jobsArr[wStart:wEnd]
go func(wrk, s, e int) {
for j := s; j < e; j++ {
// Do some actual work. Send the actualProcess's return value to
// jobChan
jobChan <- actualProcess(wrk, jobsArr[j])
}
}(i, wStart, wEnd)
		// Advance the window to the next chunk for the next iteration
wStart = wEnd
wEnd += chunks
if i == workers-1 {
			// If the next worker is the last one,
			// let it process till the end of the slice
wEnd = length
}
}
for i := 0; i < numJobs; i++ {
		// Receive all results
fmt.Println(<-jobChan)
}
}
func actualProcess(worker int, job string) string {
return fmt.Sprintf("WorkerID: #%d, Job Value: %s", worker, job)
}
Good day,
I'm trying to implement a proper delay between batches of work: the workers should complete 30 tasks and then sleep for 5 seconds. How can I track in the code that exactly 30 tasks have been completed, and only then sleep for 5 seconds?
Below is the code that creates a pool of 30 workers, which in turn process the tasks, 30 at a time, in an unordered manner:
package main
import (
"fmt"
"math/rand"
"sync"
"time"
)
type Job struct {
id int
randomno int
}
type Result struct {
job Job
sumofdigits int
}
var jobs = make(chan Job, 10)
var results = make(chan Result, 10)
func digits(number int) int {
sum := 0
no := number
for no != 0 {
digit := no % 10
sum += digit
no /= 10
}
time.Sleep(2 * time.Second)
return sum
}
func worker(wg *sync.WaitGroup) {
for job := range jobs {
output := Result{job, digits(job.randomno)}
results <- output
}
wg.Done()
}
func createWorkerPool(noOfWorkers int) {
var wg sync.WaitGroup
for i := 0; i < noOfWorkers; i++ {
wg.Add(1)
go worker(&wg)
}
wg.Wait()
close(results)
}
func allocate(noOfJobs int) {
for i := 0; i < noOfJobs; i++ {
if i != 0 && i%30 == 0 {
fmt.Printf("SLEEPAGE 5 sec...")
			time.Sleep(5 * time.Second)
}
randomno := rand.Intn(999)
job := Job{i, randomno}
jobs <- job
}
close(jobs)
}
func result(done chan bool) {
for result := range results {
fmt.Printf("Job id %d, input random no %d , sum of digits %d\n", result.job.id, result.job.randomno, result.sumofdigits)
}
done <- true
}
func main() {
startTime := time.Now()
noOfJobs := 100
go allocate(noOfJobs)
done := make(chan bool)
go result(done)
noOfWorkers := 30
createWorkerPool(noOfWorkers)
<-done
endTime := time.Now()
diff := endTime.Sub(startTime)
fmt.Println("total time taken ", diff.Seconds(), "seconds")
}
play: https://go.dev/play/p/lehl7hoo-kp
It is not clear exactly how to make sure that 30 tasks have been completed and where to insert the delay. I would be grateful for any help.
Okay, so let's start with this working example:
func Test_t(t *testing.T) {
	// just a publisher; it publishes a result on a chan
publish := func(s int, ch chan int, wg *sync.WaitGroup) {
ch <- s // this is blocking!!!
wg.Done()
}
wg := &sync.WaitGroup{}
wg.Add(100)
// we'll use done channel to notify the work is done
res := make(chan int)
done := make(chan struct{})
	// create a goroutine that will notify us once all results have been published
go func() {
wg.Wait()
done <- struct{}{}
}()
	// let's create jobs that publish on our res chan
// please note all goroutines are created immediately
for i := 0; i < 100; i++ {
go publish(i, res, wg)
}
	// let's take 30 results and then wait
var resCounter int
forloop:
for {
select {
case ss := <-res:
println(ss)
resCounter += 1
			if resCounter%30 == 0 {
// after receiving 30 results we are blocking this thread
// no more results will be taken from the channel for 5 seconds
println("received 30 results, waiting...")
time.Sleep(5 * time.Second)
}
case <-done:
// we are done here, let's break this infinite loop
break forloop
}
}
}
I hope this shows, more or less, how it can be done.
So, what's the problem with your code?
To be honest, it looks fine (I mean, 30 results are published, then the code waits, then another 30 results, and so on), but the question is: where would you like to wait?
There are a few possibilities I guess:
creating workers (this is how your code works now; as I see it, it publishes jobs in 30-packs; please note that the 2-second delay you have in the digits function applies only to the goroutine in which that code executes)
triggering workers (the "wait" code would then live in the worker function, not allowing more work to run until it has seen how many results were published)
handling results (this is how my code works, and the proper synchronization is in the forloop; see also the sketch after this list for a way to adapt your own program)
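Applied to your original program, here is a rough sketch of that last idea: the result collector counts completed jobs and signals the allocator after every 30, and the allocator pauses for 5 seconds before handing out the next batch. The batchDone channel and the batchSize constant are names I've introduced for this sketch; everything else follows your code:
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

type Job struct {
	id       int
	randomno int
}

type Result struct {
	job         Job
	sumofdigits int
}

const batchSize = 30

var jobs = make(chan Job, 10)
var results = make(chan Result, 10)

// batchDone is signalled by the collector every time a full batch of
// batchSize results has been handled. The buffer of 1 absorbs a possible
// final signal that no one is waiting for.
var batchDone = make(chan struct{}, 1)

func digits(number int) int {
	sum := 0
	for no := number; no != 0; no /= 10 {
		sum += no % 10
	}
	time.Sleep(2 * time.Second)
	return sum
}

func worker(wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range jobs {
		results <- Result{job, digits(job.randomno)}
	}
}

func createWorkerPool(noOfWorkers int) {
	var wg sync.WaitGroup
	for i := 0; i < noOfWorkers; i++ {
		wg.Add(1)
		go worker(&wg)
	}
	wg.Wait()
	close(results)
}

func allocate(noOfJobs int) {
	for i := 0; i < noOfJobs; i++ {
		if i != 0 && i%batchSize == 0 {
			// Wait until the collector reports that the previous batch has
			// actually been completed, then sleep before continuing.
			<-batchDone
			fmt.Println("30 jobs done, sleeping 5 sec...")
			time.Sleep(5 * time.Second)
		}
		jobs <- Job{i, rand.Intn(999)}
	}
	close(jobs)
}

func collect(done chan bool) {
	count := 0
	for r := range results {
		fmt.Printf("Job id %d, input %d, sum of digits %d\n", r.job.id, r.job.randomno, r.sumofdigits)
		count++
		if count%batchSize == 0 {
			batchDone <- struct{}{}
		}
	}
	done <- true
}

func main() {
	noOfJobs := 100
	go allocate(noOfJobs)
	done := make(chan bool)
	go collect(done)
	createWorkerPool(30)
	<-done
}
The important detail is that the allocator waits on completed results (counted by the collector), not on how many jobs it has already handed out.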
The code below starts a few workers. Each worker receives a value via a channel which is added to a map where the key is the worker ID and value is the number received. Finally, when I add all the values received, I should get an expected result (in this case 55 because that is what you get when you add from 1..10). In most cases, I am not seeing the expected output. What am I doing wrong here? I do not want to solve it by adding a sleep. I would like to identify the issue programmatically and fix it.
type counter struct {
value int
count int
}
var data map[string]counter
var lock sync.Mutex
func adder(wid string, n int) {
defer lock.Unlock()
lock.Lock()
d := data[wid]
d.count++
d.value += n
data[wid] = d
return
}
func main() {
fmt.Println(os.Getpid())
data = make(map[string]counter)
c := make(chan int)
for w := 1; w <= 3; w++ { //starting 3 workers here
go func(wid string) {
data[wid] = counter{}
for {
v, k := <-c
if !k {
continue
}
adder(wid, v)
}
}(strconv.Itoa(w)) // worker is given an ID
}
time.Sleep(1 * time.Second) // If this is not added, only one goroutine is recorded.
for i := 1; i <= 10; i++ {
c <- i
}
close(c)
total := 0
for i, v := range data {
fmt.Println(i, v)
total += v.value
}
fmt.Println(total)
}
Your code has two significant races:
The initialization of data[wid] = counter{} is not synchronized with other goroutines that may be reading and rewriting data.
The worker goroutines do not signal when they are done modifying data, which means your main goroutine may read data before they finish writing.
You also have a strange construct:
for {
v, k := <-c
if !k {
continue
}
adder(wid, v)
}
k will only be false when the channel c is closed, after which the goroutine spins as much as it can. This would be better written as for v := range c.
To fix the worker loops, we'll use the more normal for ... range c idiom; to fix the reading code in the main goroutine, we'll add a sync.WaitGroup, have each worker invoke Done() on it, and have the main goroutine wait for them to finish before reading data. To fix the initialization, we'll lock the map (there are other ways to do this, e.g., set up the map before starting any of the goroutines, or rely on the fact that empty map slots read as zero, but this one is straightforward). I also took out the extra debug output. The result is this code, also available on the Go Playground.
package main
import (
"fmt"
// "os"
"strconv"
"sync"
// "time"
)
type counter struct {
value int
count int
}
var data map[string]counter
var lock sync.Mutex
var wg sync.WaitGroup
func adder(wid string, n int) {
defer lock.Unlock()
lock.Lock()
d := data[wid]
d.count++
d.value += n
data[wid] = d
}
func main() {
// fmt.Println(os.Getpid())
data = make(map[string]counter)
c := make(chan int)
for w := 1; w <= 3; w++ { //starting 3 workers here
wg.Add(1)
go func(wid string) {
lock.Lock()
data[wid] = counter{}
lock.Unlock()
for v := range c {
adder(wid, v)
}
wg.Done()
}(strconv.Itoa(w)) // worker is given an ID
}
for i := 1; i <= 10; i++ {
c <- i
}
close(c)
wg.Wait()
total := 0
for i, v := range data {
fmt.Println(i, v)
total += v.value
}
fmt.Println(total)
}
(This can be improved easily, e.g., there's no reason for wg to be global.)
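For instance, here is a sketch of just that change, with everything else kept as above: declare the WaitGroup inside main and let each worker use it through the closure:
func main() {
	data = make(map[string]counter)
	c := make(chan int)
	var wg sync.WaitGroup
	for w := 1; w <= 3; w++ { // starting 3 workers here
		wg.Add(1)
		go func(wid string) {
			defer wg.Done()
			lock.Lock()
			data[wid] = counter{}
			lock.Unlock()
			for v := range c {
				adder(wid, v)
			}
		}(strconv.Itoa(w))
	}
	for i := 1; i <= 10; i++ {
		c <- i
	}
	close(c)
	wg.Wait()
	total := 0
	for i, v := range data {
		fmt.Println(i, v)
		total += v.value
	}
	fmt.Println(total)
}
The package-level wg declaration can then be dropped.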
Well, I like #torek's answer, but I wanted to post this one as it contains a bunch of improvements:
Reduced lock usage (for such a simple task, avoid taking a lock per item; if you benchmark it, you'll notice a good difference, because this code takes the lock only numworkers times — a rough benchmark sketch follows the code below).
Improved variable naming.
Removed global variables (the use of global vars should always be kept to a minimum).
The following code adds the numbers from minWork to maxWork (inclusive) using numworkers spawned goroutines.
package main
import (
"fmt"
"sync"
)
const (
bufferSize = 1 // Buffer for numChan
numworkers = 3 // Number of workers doing addition
minWork = 1 // Sum from [minWork] (inclusive)
maxWork = 10000000 // Sum upto [maxWork] (inclusive)
)
// worker stats
type worker struct {
	workCount int // Number of times the worker worked
workDone int // Amount of work done; numbers added
}
// workerMap holds a map for worker(s)
type workerMap struct {
mu sync.Mutex // Guards m for safe, concurrent r/w
m map[int]worker // Map to hold worker id to worker mapping
}
func main() {
var (
totalWorkDone int // Total Work Done
wm workerMap // WorkerMap
wg sync.WaitGroup // WaitGroup
numChan = make(chan int, bufferSize) // Channel for nums
)
wm.m = make(map[int]worker, numworkers)
for wid := 0; wid < numworkers; wid++ {
wg.Add(1)
go func(id int) {
var wk worker
// Wait for numbers
for n := range numChan {
wk.workCount++
wk.workDone += n
}
// Fill worker stats
wm.mu.Lock()
wm.m[id] = wk
wm.mu.Unlock()
wg.Done()
}(wid)
}
// Send numbers for addition by multiple workers
for i := minWork; i <= maxWork; i++ {
numChan <- i
}
// Close the channel
close(numChan)
// Wait for goroutines to finish
wg.Wait()
// Print stats
for k, v := range wm.m {
fmt.Printf("WorkerID: %d; Work: %+v\n", k, v)
totalWorkDone += v.workDone
}
// Print total work done by all workers
fmt.Printf("Work Done: %d\n", totalWorkDone)
}
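If you want to check the locking claim from the first bullet yourself, here is a rough benchmark sketch. Drop it into a _test.go file next to the code above (it reuses the bufferSize and numworkers constants) and run go test -bench=.; it compares taking the mutex once per received number against taking it once per worker, using a plain sum instead of the full worker stats. The exact numbers will of course vary by machine.
package main

import (
	"sync"
	"testing"
)

// Strategy 1: take the lock for every received number.
func BenchmarkLockPerItem(b *testing.B) {
	var (
		mu  sync.Mutex
		sum int
		wg  sync.WaitGroup
	)
	ch := make(chan int, bufferSize)
	for w := 0; w < numworkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range ch {
				mu.Lock()
				sum += n
				mu.Unlock()
			}
		}()
	}
	for i := 0; i < b.N; i++ {
		ch <- i
	}
	close(ch)
	wg.Wait()
}

// Strategy 2: accumulate locally and take the lock once per worker.
func BenchmarkLockPerWorker(b *testing.B) {
	var (
		mu  sync.Mutex
		sum int
		wg  sync.WaitGroup
	)
	ch := make(chan int, bufferSize)
	for w := 0; w < numworkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			local := 0
			for n := range ch {
				local += n
			}
			mu.Lock()
			sum += local
			mu.Unlock()
		}()
	}
	for i := 0; i < b.N; i++ {
		ch <- i
	}
	close(ch)
	wg.Wait()
}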
I have a worker pool listening on a jobs channel, and responding on a results channel.
The jobs producer must run on a fixed ticker interval. Results must be flushed before reading just enough new jobs to fill up the buffer. It's critical to flush results, and read new jobs, in batches.
See example code below, run it on the playground here.
Is it possible to rewrite this without an atomic counter for keeping track of inflight jobs?
// Worker pool with buffered jobs and fixed polling interval
package main
import (
"fmt"
"math/rand"
"os"
"os/signal"
"strings"
"sync"
"sync/atomic"
"syscall"
"time"
)
func main() {
rand.Seed(time.Now().UnixNano())
// buf is the size of the jobs buffer
buf := 5
// workers is the number of workers to start
workers := 3
// jobs chan for workers
jobs := make(chan int, buf)
// results chan for workers
results := make(chan int, buf*2)
// jobID is incremented for each job sent on the jobs chan
var jobID int
// inflight is a count of the items in the jobs chan buffer
var inflight uint64
// pollInterval for jobs producer
pollInterval := 500 * time.Millisecond
// pollDone chan to stop polling
pollDone := make(chan bool)
// jobMultiplier on pollInterval for random job processing times
jobMultiplier := 5
// done chan to exit program
done := make(chan bool)
// Start workers
wg := sync.WaitGroup{}
for n := 0; n < workers; n++ {
wg.Add(1)
go (func(n int) {
defer wg.Done()
for {
// Receive from channel or block
jobID, more := <-jobs
if more {
// To subtract a signed positive constant value...
// https://golang.org/pkg/sync/atomic/#AddUint64
c := atomic.AddUint64(&inflight, ^uint64(0))
fmt.Println(
fmt.Sprintf("worker %v processing %v - %v jobs left",
n, jobID, c))
// Processing the job...
m := rand.Intn(jobMultiplier)
time.Sleep(time.Duration(m) * pollInterval)
results <- jobID
} else {
fmt.Println(fmt.Sprintf("worker %v exited", n))
return
}
}
})(n)
}
// Signal to exit
sig := make(chan os.Signal, 1)
signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
fmt.Println("ctrl+c to exit")
go (func() {
ticker := time.NewTicker(pollInterval)
r := make([]string, 0)
flushResults := func() {
fmt.Println(
fmt.Sprintf("===> results: %v", strings.Join(r, ",")))
r = make([]string, 0)
}
for {
select {
case <-ticker.C:
flushResults()
// Fetch jobs
c := atomic.LoadUint64(&inflight)
d := uint64(buf) - c
for i := 0; i < int(d); i++ {
jobID++
jobs <- jobID
atomic.AddUint64(&inflight, 1)
}
fmt.Println(fmt.Sprintf("===> send %v jobs", d))
case jobID := <-results:
r = append(r, fmt.Sprintf("%v", jobID))
case <-pollDone:
// Stop polling for new jobs
ticker.Stop()
// Close jobs channel to stop workers
close(jobs)
// Wait for workers to exit
wg.Wait()
close(results)
// Flush remaining results
for {
jobID, more := <-results
if more {
r = append(r, fmt.Sprintf("%v", jobID))
} else {
break
}
}
flushResults()
// Done!
done <- true
return
}
}
})()
// Wait for exit signal
<-sig
fmt.Println("---------| EXIT |---------")
pollDone <- true
<-done
fmt.Println("...done")
}
Here is a channel-based version of your code, functionally equivalent to the intent of the example above. The key point is that we're not using any atomic values to vary the logic of the code, because that offers no synchronization between the goroutines. All interactions between the goroutines are synchronized using channels, sync.WaitGroup, or context.Context. There are probably better ways to solve the problem at hand, but this demonstrates that no atomics are necessary to coordinate the queue and workers.
The only value that is still left uncoordinated between goroutines here is the use of len(jobs) in the log output. Whether it makes sense to use it or not is up to you, as its value is meaningless in a concurrent world, but it's safe because it's synchronized for concurrent use and there is no logic based on the value.
buf := 5
workers := 3
jobs := make(chan int, buf)
// results buffer must always be larger than workers + buf to prevent deadlock
results := make(chan int, buf*2)
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// pollInterval and jobID are carried over from the original example
pollInterval := 500 * time.Millisecond
var jobID int
// Start workers
var wg sync.WaitGroup
for n := 0; n < workers; n++ {
wg.Add(1)
go func(n int) {
defer wg.Done()
for jobID := range jobs {
fmt.Printf("worker %v processing %v - %v jobs left\n", n, jobID, len(jobs))
time.Sleep(time.Duration(rand.Intn(5)) * pollInterval)
results <- jobID
}
		fmt.Printf("worker %v exited\n", n)
}(n)
}
var done sync.WaitGroup
done.Add(1)
go func() {
defer done.Done()
ticker := time.NewTicker(pollInterval)
r := make([]string, 0)
flushResults := func() {
fmt.Printf("===> results: %v\n", strings.Join(r, ","))
r = r[:0]
}
for {
select {
case <-ticker.C:
flushResults()
			// send up to buf jobs without blocking; stop early if the
			// jobs buffer is already full
			var i int
			for ; i < buf; i++ {
				jobID++
				select {
				case jobs <- jobID:
					continue
				default:
					// the buffer is full: undo the increment and stop sending
					jobID--
				}
				break
			}
			fmt.Printf("===> send %v jobs\n", i)
case jobID := <-results:
r = append(r, fmt.Sprintf("%v", jobID))
case <-ctx.Done():
// Close jobs channel to stop workers
close(jobs)
// Wait for workers to exit
wg.Wait()
// we can close results for easy iteration because we know
// there are no more workers.
close(results)
// Flush remaining results
for jobID := range results {
r = append(r, fmt.Sprintf("%v", jobID))
}
flushResults()
return
}
}
}()
I'm thinking of starting 1000 goroutines at the same time using a for loop in Go.
The problem is: I have to make sure that every goroutine has finished executing.
Is it possible to use channels to help me make sure of that?
The structure is kinda like this:
func main() {
	for i ... {
		go ...
		ch?
		ch?
	}
}
As #Andy mentioned, you can use sync.WaitGroup to achieve this. Below is an example; I hope the code is self-explanatory.
package main
import (
"fmt"
"sync"
"time"
)
func dosomething(millisecs int64, wg *sync.WaitGroup) {
defer wg.Done()
duration := time.Duration(millisecs) * time.Millisecond
time.Sleep(duration)
fmt.Println("Function in background, duration:", duration)
}
func main() {
arr := []int64{200, 400, 150, 600}
var wg sync.WaitGroup
for _, n := range arr {
wg.Add(1)
go dosomething(n, &wg)
}
wg.Wait()
fmt.Println("Done")
}
To make sure the goroutines are done and collect the results, try this example:
package main
import (
"fmt"
)
const max = 1000
func main() {
for i := 1; i <= max; i++ {
go f(i)
}
sum := 0
for i := 1; i <= max; i++ {
sum += <-ch
}
fmt.Println(sum) // 500500
}
func f(n int) {
// do some job here and return the result:
ch <- n
}
var ch = make(chan int, max)
In order to wait for 1000 goroutines to finish, try this example:
package main
import (
"fmt"
"sync"
)
func main() {
wg := &sync.WaitGroup{}
for i := 0; i < 1000; i++ {
wg.Add(1)
go f(wg, i)
}
wg.Wait()
fmt.Println("Done.")
}
func f(wg *sync.WaitGroup, n int) {
defer wg.Done()
fmt.Print(n, " ")
}
I would suggest that you follow a pattern. Concurrency and channels are good, but if you use them badly your program might become even slower than expected. A simple way to handle multiple goroutines and channels is the worker pool pattern.
Take a close look at the code below:
// In this example we'll look at how to implement
// a _worker pool_ using goroutines and channels.
package main
import "fmt"
import "time"
// Here's the worker, of which we'll run several
// concurrent instances. These workers will receive
// work on the `jobs` channel and send the corresponding
// results on `results`. We'll sleep a second per job to
// simulate an expensive task.
func worker(id int, jobs <-chan int, results chan<- int) {
for j := range jobs {
fmt.Println("worker", id, "started job", j)
time.Sleep(time.Second)
fmt.Println("worker", id, "finished job", j)
results <- j * 2
}
}
func main() {
// In order to use our pool of workers we need to send
// them work and collect their results. We make 2
// channels for this.
jobs := make(chan int, 100)
results := make(chan int, 100)
// This starts up 3 workers, initially blocked
// because there are no jobs yet.
for w := 1; w <= 3; w++ {
go worker(w, jobs, results)
}
// Here we send 5 `jobs` and then `close` that
// channel to indicate that's all the work we have.
for j := 1; j <= 5; j++ {
jobs <- j
}
close(jobs)
// Finally we collect all the results of the work.
for a := 1; a <= 5; a++ {
<-results
}
}
This simple example is taken from here. The results channel can also help you keep track of all the goroutines executing the jobs, including failure notices.
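To illustrate the failure-notice point, here is a small sketch (the jobResult type and the pretend every-fourth-job-fails rule are made up for illustration) where each worker reports either a value or an error on the results channel:
package main

import (
	"errors"
	"fmt"
)

// jobResult carries the job ID plus any error the worker hit, so the
// collector can see failures as well as successes.
type jobResult struct {
	job int
	val int
	err error
}

func worker(id int, jobs <-chan int, results chan<- jobResult) {
	for j := range jobs {
		if j%4 == 0 {
			// pretend every fourth job fails
			results <- jobResult{job: j, err: errors.New("boom")}
			continue
		}
		fmt.Println("worker", id, "finished job", j)
		results <- jobResult{job: j, val: j * 2}
	}
}

func main() {
	jobs := make(chan int, 100)
	results := make(chan jobResult, 100)
	for w := 1; w <= 3; w++ {
		go worker(w, jobs, results)
	}
	for j := 1; j <= 5; j++ {
		jobs <- j
	}
	close(jobs)
	// Collect every result, including the failure notices.
	for a := 1; a <= 5; a++ {
		r := <-results
		if r.err != nil {
			fmt.Println("job", r.job, "failed:", r.err)
			continue
		}
		fmt.Println("job", r.job, "=>", r.val)
	}
}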
// _Closing_ a channel indicates that no more values
// will be sent on it. This can be useful to communicate
// completion to the channel's receivers.
package main
import "fmt"
// In this example we'll use a `jobs` channel to
// communicate work to be done from the `main()` goroutine
// to a worker goroutine. When we have no more jobs for
// the worker we'll `close` the `jobs` channel.
func main() {
jobs := make(chan int, 5)
done := make(chan bool)
// Here's the worker goroutine. It repeatedly receives
// from `jobs` with `j, more := <-jobs`. In this
// special 2-value form of receive, the `more` value
// will be `false` if `jobs` has been `close`d and all
// values in the channel have already been received.
// We use this to notify on `done` when we've worked
// all our jobs.
for i := 1; i <= 3; i++ {
go func() {
for {
j, more := <-jobs
if more {
fmt.Println("received job", j)
} else {
fmt.Println("received all jobs")
done <- true
return
}
}
}()
}
// This sends 3 jobs to the worker over the `jobs`
// channel, then closes it.
j := 0
for {
j++
jobs <- j
fmt.Println("sent job", j)
}
close(jobs)
fmt.Println("sent all jobs")
// We await the worker using the
// [synchronization](channel-synchronization) approach
// we saw earlier.
<-done
}
https://play.golang.org/p/x28R_g8ftS
What I'm trying to do is get all the responses from a paginated URL endpoint. jobs is a channel storing the page number. I have a function in the if more {} branch checking for an empty response, and I have
done <- true
return
I thought this would close the goroutine.
But the page generator, for { j++; jobs <- j }, is causing it to get stuck in a loop. Any idea how this can be resolved?
By definition, a for loop without a condition is an infinite loop. Unless you put in some logic to break out of it, you'll never get out.
In your playground, your comment implies that you want to send 3 jobs. You should change your for loop accordingly:
for j := 0; j < 3; j++ {
jobs <- j
fmt.Println("sent job", j)
}
This is a simplified version of a worker. It's not very useful for production-level traffic, but it should serve as a simple example; there are tons of them :-)
package main
import (
"log"
"sync"
)
type worker struct {
jobs chan int
wg *sync.WaitGroup
}
func main() {
w := worker{
jobs: make(chan int, 5), // I only want to work on 5 jobs at any given time
wg: new(sync.WaitGroup),
}
for i := 0; i < 3; i++ {
w.wg.Add(1)
go func(i int) {
defer w.wg.Done()
w.jobs <- i
}(i)
}
	// wait in the background so that we can move on to the for range loop below and start consuming the job queue
go func() {
w.wg.Wait()
close(w.jobs)
}()
for job := range w.jobs {
log.Println("Got job, I should do something with it", job)
}
}
This is what I was looking for. I have a number generator in an infinite loop, and the program exits on some condition; in this example it is based on the j value, but it could also be something else.
https://play.golang.org/p/Ud4etTjrmx
package main
import "fmt"
func jobs(job chan int) {
i := 1
for {
job <- i
i++
}
}
func main() {
jobsChan := make(chan int, 5)
done := false
j := 0
go jobs(jobsChan)
for !done {
j = <-jobsChan
if j < 20 {
fmt.Printf("job %d\n", j)
} else {
done = true
}
}
}