the below code that get worker from channel and execute function "call", all routines finish and print that they are done but wait never finishes,
i traced the counter of WaitGroup by making varible counter incresing when add to wg and decresing when done and it was zero at the end of for loop
any help please
package mapreduce
import (
"fmt"
"sync"
)
func schedule(jobName string, mapFiles []string, nReduce int, phase jobPhase, registerChan chan string) {
var ntasks int
var n_other int // number of inputs (for reduce) or outputs (for map)
switch phase {
case mapPhase:
ntasks = len(mapFiles)
n_other = nReduce
case reducePhase:
ntasks = nReduce
n_other = len(mapFiles)
}
fmt.Printf("Schedule: %v %v tasks (%d I/Os)\n", ntasks, phase, n_other)
var wg sync.WaitGroup
for i := 0; i < ntasks; i++ {
worker := <-registerChan
doTaskArg := DoTaskArgs{jobName, mapFiles[i], phase, i, n_other}
wg.Add(1)
go func() {
defer wg.Done()
done := call(worker, "Worker.DoTask", doTaskArg, nil)
if done {
registerChan <- worker
} else {
i = i - 1
}
}()
}
wg.Wait()
fmt.Printf("Schedule: %v phase done\n", phase)
}
The channel is blocking your goroutine. If you put some data into a unbuffered channel the goroutine waits until the receiver gets the the data from the channel. In your case your routine blocks at register <- worker and defer wg.Done() is never called, because the function is waiting.
Related
Unable to find a reason as to why this code deadlocks. The aim here is make the worker go routines do some work only after they are signaled.
If the signalStream channel is removed from the code, it works fine. But when this is introduced, it deadlocks. Not sure the reason for this. Also if there are any tools to explain the occurrence of deadlock, it would help.
package main
import (
"log"
"sync"
)
const jobs = 10
const workers = 5
var wg sync.WaitGroup
func main() {
// create channel
dataStream := make(chan interface{})
signalStream := make(chan interface{})
// Generate workers
for i := 1; i <= workers; i++ {
wg.Add(1)
go worker(dataStream, signalStream, i*100)
}
// Generate jobs
for i := 1; i <= jobs; i++ {
dataStream <- i
}
close(dataStream)
// start consuming data
close(signalStream)
wg.Wait()
}
func worker(c <-chan interface{}, s <-chan interface{}, id int) {
defer wg.Done()
<-s
for i := range c {
log.Printf("routine - %d - %d \n", id, i)
}
}
Generate the jobs in a separate gorouine, i.e. put the whole jobs loop into a goroutine. If you don't then dataStream <- i will block and your program will never "start consuming data"
// Generate jobs
go func() {
for i := 1; i <= jobs; i++ {
dataStream <- i
}
close(dataStream)
}()
https://go.dev/play/p/ChlbsJlgwdE
In the code hereunder, I don't understand why the "Worker" methods seem to exit instead of pulling values from the input channel "in" and processing them.
I had assumed they would only return after having consumed all input from the input channel "in" and processing them
package main
import (
"fmt"
"sync"
)
type ParallelCallback func(chan int, chan Result, int, *sync.WaitGroup)
type Result struct {
i int
val int
}
func Worker(in chan int, out chan Result, id int, wg *sync.WaitGroup) {
for item := range in {
item *= item // returns the square of the input value
fmt.Printf("=> %d: %d\n", id, item)
out <- Result{item, id}
}
wg.Done()
fmt.Printf("%d exiting ", id)
}
func Run_parallel(n_workers int, in chan int, out chan Result, Worker ParallelCallback) {
wg := sync.WaitGroup{}
for id := 0; id < n_workers; id++ {
fmt.Printf("Starting : %d\n", id)
wg.Add(1)
go Worker(in, out, id, &wg)
}
wg.Wait() // wait for all workers to complete their tasks
close(out) // close the output channel when all tasks are completed
}
const (
NW = 4
)
func main() {
in := make(chan int)
out := make(chan Result)
go func() {
for i := 0; i < 100; i++ {
in <- i
}
close(in)
}()
Run_parallel(NW, in, out, Worker)
for item := range out {
fmt.Printf("From out : %d: %d", item.i, item.val)
}
}
The output is
Starting : 0
Starting : 1
Starting : 2
Starting : 3
=> 3: 0
=> 0: 1
=> 1: 4
=> 2: 9
fatal error: all goroutines are asleep - deadlock!
fatal error: all goroutines are asleep - deadlock!
The full error shows where each goroutine is "stuck". If you run this in the playground, it will even show you the line number. That made it easy for me to diagnose.
Your Run_parallel runs in the main groutine, so before main can read from out, Run_parallel must return. Before Run_parallel can return, it must wg.Wait(). But before the workers call wg.Done(), they must write to out. That's what causes a deadlock.
One solution is simple: just run Run_parallel concurrently in its own Goroutine.
go Run_parallel(NW, in, out, Worker)
Now, main ranges over out, waiting on outs closure to signal completion. Run_parallel waits for the workers with wg.Wait(), and the workers will range over in. All the work will get done, and the program won't end until it's all done. (https://go.dev/play/p/oMrgH2U09tQ)
Solution :
Run_parallel has to run in it’s own goroutine:
package main
import (
"fmt"
"sync"
)
type ParallelCallback func(chan int, chan Result, int, *sync.WaitGroup)
type Result struct {
id int
val int
}
func Worker(in chan int, out chan Result, id int, wg *sync.WaitGroup) {
defer wg.Done()
for item := range in {
item *= 2 // returns the double of the input value (Bogus handling of data)
out <- Result{id, item}
}
}
func Run_parallel(n_workers int, in chan int, out chan Result, Worker ParallelCallback) {
wg := sync.WaitGroup{}
for id := 0; id < n_workers; id++ {
wg.Add(1)
go Worker(in, out, id, &wg)
}
wg.Wait() // wait for all workers to complete their tasks
close(out) // close the output channel when all tasks are completed
}
const (
NW = 8
)
func main() {
in := make(chan int)
out := make(chan Result)
go func() {
for i := 0; i < 10; i++ {
in <- i
}
close(in)
}()
go Run_parallel(NW, in, out, Worker)
for item := range out {
fmt.Printf("From out [%d]: %d\n", item.id, item.val)
}
println("- - - All done - - -")
}
Alternative formulation of the solution:
In that alternative formulation , it is not necessary to start Run_parallel as a goroutine (it triggers its own goroutine).
I prefer that second solution, because it automates the fact that Run_parallel() has to run parallel to the main function. Also, for the same reason it's safer, less error-prone (no need to remember to run Run_parallel with the go keyword).
package main
import (
"fmt"
"sync"
)
type ParallelCallback func(chan int, chan Result, int, *sync.WaitGroup)
type Result struct {
id int
val int
}
func Worker(in chan int, out chan Result, id int, wg *sync.WaitGroup) {
defer wg.Done()
for item := range in {
item *= 2 // returns the double of the input value (Bogus handling of data)
out <- Result{id, item}
}
}
func Run_parallel(n_workers int, in chan int, out chan Result, Worker ParallelCallback) {
go func() {
wg := sync.WaitGroup{}
defer close(out) // close the output channel when all tasks are completed
for id := 0; id < n_workers; id++ {
wg.Add(1)
go Worker(in, out, id, &wg)
}
wg.Wait() // wait for all workers to complete their tasks *and* trigger the -differed- close(out)
}()
}
const (
NW = 8
)
func main() {
in := make(chan int)
out := make(chan Result)
go func() {
defer close(in)
for i := 0; i < 10; i++ {
in <- i
}
}()
Run_parallel(NW, in, out, Worker)
for item := range out {
fmt.Printf("From out [%d]: %d\n", item.id, item.val)
}
println("- - - All done - - -")
}
Suppose there are a maximum two elements (worker addresses) on registerChan at any point. Then for some reason, the following code does not call wg.Done() in the last two goroutines.
func schedule(jobName string, mapFiles []string, nReduce int, phase jobPhase, registerChan chan string) {
var ntasks int
var nOther int // number of inputs (for reduce) or outputs (for map)
switch phase {
case mapPhase:
ntasks = len(mapFiles)
nOther = nReduce
case reducePhase:
ntasks = nReduce
nOther = len(mapFiles)
}
fmt.Printf("Schedule: %v %v tasks (%d I/Os)\n", ntasks, phase, nOther)
const rpcname = "Worker.DoTask"
var wg sync.WaitGroup
for taskNumber := 0; taskNumber < ntasks; taskNumber++ {
file := mapFiles[taskNumber%len(mapFiles)]
taskArgs := DoTaskArgs{jobName, file, phase, taskNumber, nOther}
wg.Add(1)
go func(taskArgs DoTaskArgs) {
workerAddr := <-registerChan
print("hello\n")
// _ = call(workerAddr, rpcname, taskArgs, nil)
registerChan <- workerAddr
wg.Done()
}(taskArgs)
}
wg.Wait()
fmt.Printf("Schedule: %v done\n", phase)
}
If I put wg.Done() before registerChan <- workerAddr it works just fine and I have no idea why. I have also tried deferring wg.Done() but that doesn't seem to work even though I expected it to. I think I have some misunderstanding of how go routines and channels work which is causing my confusion.
Because it stopped here:
workerAddr := <-registerChan
For a buffered channel:
To get this workerAddr := <-registerChan to work: the channel registerChan must have a value; otherwise, the code will stop here waiting for the channel.
I managed to run your code this way (try this):
package main
import (
"fmt"
"sync"
)
func main() {
registerChan := make(chan int, 1)
for i := 1; i <= 10; i++ {
wg.Add(1)
go fn(i, registerChan)
}
registerChan <- 0 // seed
wg.Wait()
fmt.Println(<-registerChan)
}
func fn(taskArgs int, registerChan chan int) {
workerAddr := <-registerChan
workerAddr += taskArgs
registerChan <- workerAddr
wg.Done()
}
var wg sync.WaitGroup
Output:
55
Explanation:
This code adds 1 to 10 using a channel and 10 goroutines plus one main goroutine.
I hope this helps.
When you run this statement registerChan <- workerAddr, if the channel capacity is full you cannot add in it and it will block. If you have a pool of, say 10, workerAddr, you could add all of them in a buffered channel of capacity 10 before calling schedule. Do not add after the call, to guarantee that if you take a value from the channel, there is space to add it again after. Using defer at the beginning of your goroutine is good.
Intent:
I am looking for a means to run os-level shell commands in parallel, but want to be careful to not clobber CPU and am wondering if a buffered channel would fit this use case.
Implemented:
Create a series of Jobs with a simulated runtime duration. Send these jobs to a queue which will dispatch them to run over a buffered channel as throttled by EXEC_THROTTLE.
Observations:
This 'works' (to the extent that it compiles and runs), but I am wondering if the buffer is working as specified (see: 'Intent') to throttle the number of processes running in parallel.
Disclaimer:
Now, I am aware that newbies tend to over-use channels, but I feel this request for insight is honest, as I've at least exercised the restraint to use a sync.WaitGroup. Forgive the somewhat toy example, but all insight would be appreciated.
Playground
package main
import (
// "os/exec"
"log"
"math/rand"
"strconv"
"sync"
"time"
)
const (
EXEC_THROTTLE = 2
)
type JobsManifest []Job
type Job struct {
cmd string
result string
runtime int // Simulate long-running task
}
func (j JobsManifest) queueJobs(logChan chan<- string, runChan chan Job, wg *sync.WaitGroup) {
go dispatch(logChan, runChan)
for _, job := range j {
wg.Add(1)
runChan <- job
}
}
func dispatch(logChan chan<- string, runChan chan Job) {
for j := range runChan {
go run(j, logChan)
}
}
func run(j Job, logChan chan<- string) {
time.Sleep(time.Second * time.Duration(j.runtime))
j.result = strconv.Itoa(rand.Intn(10)) // j.result = os.Exec("/bin/bash", "-c", j.cmd).Output()
logChan <- j.result
log.Printf(" ran: %s\n", j.cmd)
}
func logger(logChan <-chan string, wg *sync.WaitGroup) {
for {
res := <-logChan
log.Printf("logged: %s\n", res)
wg.Done()
}
}
func main() {
jobs := []Job{
Job{
cmd: "ps -p $(pgrep vim) | tail -n 1 | awk '{print $3}'",
runtime: 1,
},
Job{
cmd: "wc -l /var/log/foo.log | awk '{print $1}'",
runtime: 2,
},
Job{
cmd: "ls -l ~/go/src/github.com/ | wc -l | awk '{print $1}'",
runtime: 3,
},
Job{
cmd: "find /var/log/ -regextype posix-extended -regex '.*[0-9]{10}'",
runtime: 4,
},
}
var wg sync.WaitGroup
logChan := make(chan string)
runChan := make(chan Job, EXEC_THROTTLE)
go logger(logChan, &wg)
start := time.Now()
JobsManifest(jobs).queueJobs(logChan, runChan, &wg)
wg.Wait()
log.Printf("finish: %s\n", time.Since(start))
}
You can also cap concurrency with buffered channel:
concurrencyLimit := 2 // Number of simultaneous jobs.
semaphore := make(chan struct{}, concurrencyLimit)
for job := range jobs {
job := job // Pin loop variable.
semaphore <- struct{}{} // Acquire semaphore slot.
go func() {
defer func() {
<-semaphore // Release semaphore slot.
}()
do(job) // Do the job.
}()
}
// Wait for goroutines to finish by acquiring all slots.
for i := 0; i < cap(semaphore); i++ {
semaphore <- struct{}{}
}
Replace processItem function with required execution of your job.
Below will execute jobs in proper order. Atmost EXEC_CONCURRENT items will be executed concurrently.
package main
import (
"fmt"
"sync"
"time"
)
func processItem(i int, done chan int, wg *sync.WaitGroup) {
fmt.Printf("Async Start: %d\n", i)
time.Sleep(100 * time.Millisecond * time.Duration(i))
fmt.Printf("Async Complete: %d\n", i)
done <- 1
wg.Done()
}
func popItemFromBufferChannelWhenItemDoneExecuting(items chan int, done chan int) {
_ = <- done
_ = <-items
}
func main() {
EXEC_CONCURRENT := 3
items := make(chan int, EXEC_CONCURRENT)
done := make(chan int)
var wg sync.WaitGroup
for i:= 1; i < 11; i++ {
items <- i
wg.Add(1)
go processItem(i, done, &wg)
go popItemFromBufferChannelWhenItemDoneExecuting(items, done)
}
wg.Wait()
}
Below will execute jobs in Random order. Atmost EXEC_CONCURRENT items will be executed concurrently.
package main
import (
"fmt"
"sync"
"time"
)
func processItem(i int, items chan int, wg *sync.WaitGroup) {
items <- i
fmt.Printf("Async Start: %d\n", i)
time.Sleep(100 * time.Millisecond * time.Duration(i))
fmt.Printf("Async Complete: %d\n", i)
_ = <- items
wg.Done()
}
func main() {
EXEC_CONCURRENT := 3
items := make(chan int, EXEC_CONCURRENT)
var wg sync.WaitGroup
for i:= 1; i < 11; i++ {
wg.Add(1)
go processItem(i, items, &wg)
}
wg.Wait()
}
You can choose according to your requirement.
If I understand you right, you mean to establish a mechanism to ensure that at any time at most a number of EXEC_THROTTLE jobs are running. And if that is your intention, the code does not work.
It is because when you start a job, you have already consumed the channel - allowing another job to be started, yet no jobs have been finished. You can debug this by add an counter (you'll need atomic add or mutex).
You may do the work by simply start a group of goroutine with an unbuffered channel and block when executating jobs:
func Run(j Job) r Result {
//Run your job here
}
func Dispatch(ch chan Job) {
for j:=range ch {
wg.Add(1)
Run(j)
wg.Done()
}
}
func main() {
ch := make(chan Job)
for i:=0; i<EXEC_THROTTLE; i++ {
go Dispatch(ch)
}
//call dispatch according to the queue here.
}
It works because as along as one goroutine is consuming the channel, it means at least one goroutine is not running and there is at most EXEC_THROTTLE-1 jobs running so it is good to execuate one more and it does so.
I use this a lot. https://github.com/dustinevan/go-utils
package async
import (
"context"
"github.com/pkg/errors"
)
type Semaphore struct {
buf chan struct{}
ctx context.Context
cancel context.CancelFunc
}
func NewSemaphore(max int, parentCtx context.Context) *Semaphore {
s := &Semaphore{
buf: make(chan struct{}, max),
ctx: parentCtx,
}
go func() {
<-s.ctx.Done()
close(s.buf)
drainStruct(s.buf)
}()
return s
}
var CLOSED = errors.New("the semaphore has been closed")
func (s *Semaphore) Acquire() error {
select {
case <-s.ctx.Done():
return CLOSED
case s.buf <- struct{}{}:
return nil
}
}
func (s *Semaphore) Release() {
<-s.buf
}
you'd use it like this:
func main() {
sem := async.NewSemaphore(10, context.Background())
...
var wg sync.Waitgroup
for _, job := range jobs {
go func() {
wg.Add(1)
err := sem.Acquire()
if err != nil {
// handle err,
}
defer sem.Release()
defer wg.Done()
job()
}
wg.Wait()
}
I have a goroutine which can generate an infinite number of values (each more suitable than the last), but it takes progressively longer to find each values. I'm trying to find a way to add a time limit, say 10 seconds, after which my function does something with the best value received so far.
This is my current "solution", using a channel and timer:
// the goroutine which runs infinitely
// (or at least a very long time for high values of depth)
func runSearch(depth int, ch chan int) {
for i := 1; i <= depth; i++ {
fmt.Printf("Searching to depth %v\n", i)
ch <- search(i)
}
}
// consumes progressively better values until the channel is closed
func awaitBestResult(ch chan int) {
var result int
for result := range ch {
best = result
}
// do something with best result here
}
// run both consumer and producer
func main() {
timer := time.NewTimer(time.Millisecond * 2000)
ch := make(chan int)
go runSearch(1000, ch)
go awaitBestResult(ch)
<-timer.C
close(ch)
}
This mostly works - the best result is processed after the timer ends and the channel is closed. However, I then get a panic (panic: send on closed channel) from the runSearch goroutine, since the channel has been closed by the main function.
How can I stop the first goroutine running after the timer has completed? Any help is very appreciated.
You need to ensure that the goroutine knows when it is done processing, so that it doesn't attempt to write to a closed channel, and panic.
This sounds like a perfect case for the context package:
func runSearch(ctx context.Context, depth int, ch chan int) {
for i := 1; i <= depth; i++ {
select {
case <- ctx.Done()
// Context cancelled, return
return
default:
}
fmt.Printf("Searching to depth %v\n", i)
ch <- search(i)
}
}
Then in main():
// run both consumer and producer
func main() {
ctx := context.WithTimeout(context.Background, 2 * time.Second)
ch := make(chan int)
go runSearch(ctx, 1000, ch)
go awaitBestResult(ch)
close(ch)
}
You are getting a panic because your sending goroutine runSearch apparently outlives the timer and it is trying to send a value on the channel which is already closed by your main goroutine. You need to devise a way to signal the sending go routine not to send any values once your timer is lapsed and before you close the channel in main. On the other hand if your search gets over sooner you also need to communicate to main to move on. You can use one channel and synchronize so that there are no race conditions. And finally you need to know when your consumer has processed all the data before you can exit main.
Here's something which may help.
package main
import (
"fmt"
"sync"
"time"
)
var mu sync.Mutex //To protect the stopped variable which will decide if a value is to be sent on the signalling channel
var stopped bool
func search(i int) int {
time.Sleep(1 * time.Millisecond)
return (i + 1)
}
// (or at least a very long time for high values of depth)
func runSearch(depth int, ch chan int, stopSearch chan bool) {
for i := 1; i <= depth; i++ {
fmt.Printf("Searching to depth %v\n", i)
n := search(i)
select {
case <-stopSearch:
fmt.Println("Timer over! Searched till ", i)
return
default:
}
ch <- n
fmt.Printf("Sent depth %v result for processing\n", i)
}
mu.Lock() //To avoid race condition with timer also being
//completed at the same time as execution of this code
if stopped == false {
stopped = true
stopSearch <- true
fmt.Println("Search completed")
}
mu.Unlock()
}
// consumes progressively better values until the channel is closed
func awaitBestResult(ch chan int, doneProcessing chan bool) {
var best int
for result := range ch {
best = result
}
fmt.Println("Best result ", best)
// do something with best result here
//and communicate to main when you are done processing the result
doneProcessing <- true
}
func main() {
doneProcessing := make(chan bool)
stopSearch := make(chan bool)
// timer := time.NewTimer(time.Millisecond * 2000)
timer := time.NewTimer(time.Millisecond * 12)
ch := make(chan int)
go runSearch(1000, ch, stopSearch)
go awaitBestResult(ch, doneProcessing)
select {
case <-timer.C:
//If at the same time runsearch is also completed and trying to send a value !
//So we hold a lock before sending value on the channel
mu.Lock()
if stopped == false {
stopped = true
stopSearch <- true
fmt.Println("Timer expired")
}
mu.Unlock()
case <-stopSearch:
fmt.Println("runsearch goroutine completed")
}
close(ch)
//Wait for your consumer to complete processing
<-doneProcessing
//Safe to exit now
}
On playground. Change the value of timer to observe both the scenarios.