go routine end before done - go

I'm tying to execute things async with multiple go routines. I pass in the number of "threads" to use to process the files async. The files is an array of Strings to process.
queue := make(chan string)
threadCount := c.Int("threads")
if c.Int("threads") < len(files) {
threadCount = len(files)
}
log.Infof("Starting %i processes", c.Int("threads"))
for i := 0; i < threadCount; i++ {
go renderGoRoutine(queue)
}
for _, f := range files {
queue <- f
}
close(queue)
And the routine itself looks like this:
func renderGoRoutine(queue chan string) {
for file := range queue {
// do some heavy lifting stuff with the file
}
}
This does work fine whenever i use just one thread. As soon as i take more then one it does exit/leave the scope before it is done with all the go routines.
How do I make it process everything?
Previous question: Using a channel for dispatching tasks to go routine

Using WaitGroups is an option.
At the beginning, you add number tasks into WaitGroup and after each task is done decrement counter in the WaitGroup. Wait until all tasks are finished at the end of your code flow.
See the example: https://godoc.org/sync#WaitGroup
Your code will look like this:
queue := make(chan string)
wg := sync.WaitGroup{}
wg.Add(len(files))
threadCount := c.Int("threads")
if c.Int("threads") < len(files) {
threadCount = len(files)
}
log.Infof("Starting %i processes", c.Int("threads"))
for i := 0; i < threadCount; i++ {
go renderGoRoutine(queue)
}
for _, f := range files {
queue <- f
}
close(queue)
wg.Wait()
renderGoRoutine:
func renderGoRoutine(queue chan string) {
for file := range queue {
// do some heavy lifting stuff with the file
// decrement the waitGroup counter
wg.Done()
}
}

You are using the channel to publish work to be done. As soon as the last item is taken from the queue (not finished processing), your program exits.
You could use a channel to write to at the end of renderGoRoutine to signal the end of processing.
At the top:
sync := make(chan bool)
In renderGoRoutine at the end (Assuming it is in the same file):
sync <- true
At the bottom:
for f := range sync {
<- sync
}
Now your program waits until the number of files are processed.
Or to have a complete example:
queue := make(chan string)
sync := make(chan bool)
threadCount := c.Int("threads")
if c.Int("threads") < len(files) {
threadCount = len(files)
}
log.Infof("Starting %i processes", c.Int("threads"))
for i := 0; i < threadCount; i++ {
go renderGoRoutine(queue)
}
for _, f := range files {
queue <- f
}
close(queue)
for f := range sync {
<- sync
}
And the routine should be changed like this:
func renderGoRoutine(queue chan string) {
for file := range queue {
// do some heavy lifting stuff with the file
sync <- true
}
}

I did forget to wait for all tasks to finish. This can simply be done by waiting for all loops to end. Since close(channel) does end the for range channel a simple sync with a channel can be used like so:
sync := make(chan bool)
queue := make(chan string)
threadCount := c.Int("threads")
if c.Int("threads") < len(files) {
threadCount = len(files)
}
log.Infof("Starting %i processes", c.Int("threads"))
for i := 0; i < threadCount; i++ {
go renderGoRoutine(queue)
}
for _, f := range files {
queue <- f
}
close(queue)
for i := 0; i < threadCount; i++ {
<- sync
}
And last but not least write to the channel whenever a iteration is stopped.
func renderGoRoutine(queue chan string) {
for file := range queue { //whatever is done here
}
sync <- true
}

Related

Unable to find reason for go deadlock

Unable to find a reason as to why this code deadlocks. The aim here is make the worker go routines do some work only after they are signaled.
If the signalStream channel is removed from the code, it works fine. But when this is introduced, it deadlocks. Not sure the reason for this. Also if there are any tools to explain the occurrence of deadlock, it would help.
package main
import (
"log"
"sync"
)
const jobs = 10
const workers = 5
var wg sync.WaitGroup
func main() {
// create channel
dataStream := make(chan interface{})
signalStream := make(chan interface{})
// Generate workers
for i := 1; i <= workers; i++ {
wg.Add(1)
go worker(dataStream, signalStream, i*100)
}
// Generate jobs
for i := 1; i <= jobs; i++ {
dataStream <- i
}
close(dataStream)
// start consuming data
close(signalStream)
wg.Wait()
}
func worker(c <-chan interface{}, s <-chan interface{}, id int) {
defer wg.Done()
<-s
for i := range c {
log.Printf("routine - %d - %d \n", id, i)
}
}
Generate the jobs in a separate gorouine, i.e. put the whole jobs loop into a goroutine. If you don't then dataStream <- i will block and your program will never "start consuming data"
// Generate jobs
go func() {
for i := 1; i <= jobs; i++ {
dataStream <- i
}
close(dataStream)
}()
https://go.dev/play/p/ChlbsJlgwdE

Deadlock using channels as queues

I'm learning Go and I am trying to implement a job queue.
What I'm trying to do is:
Have the main goroutine feed lines through a channel for multiple parser workers (that parse a line to s struct), and have each parser send the struct to a channel of structs that other workers (goroutines) will process (send to database, etc).
The code looks like this:
lineParseQ := make(chan string, 5)
jobProcessQ := make(chan myStruct, 5)
doneQ := make(chan myStruct, 5)
fileName := "myfile.csv"
file, err := os.Open(fileName)
if err != nil {
log.Fatal(err)
}
defer file.Close()
reader := bufio.NewReader(file)
// Start line parsing workers and send to jobProcessQ
for i := 1; i <= 2; i++ {
go lineToStructWorker(i, lineParseQ, jobProcessQ)
}
// Process myStruct from jobProcessQ
for i := 1; i <= 5; i++ {
go WorkerProcessStruct(i, jobProcessQ, doneQ)
}
lineCount := 0
countSend := 0
for {
line, err := reader.ReadString('\n')
if err != nil && err != io.EOF {
log.Fatal(err)
}
if err == io.EOF {
break
}
lineCount++
if lineCount > 1 {
countSend++
lineParseQ <- line[:len(line)-1] // Avoid last char '\n'
}
}
for i := 0; i < countSend; i++ {
fmt.Printf("Received %+v.\n", <-doneQ)
}
close(doneQ)
close(jobProcessQ)
close(lineParseQ)
Here's a simplified playground: https://play.golang.org/p/yz84g6CJraa
the workers look like this:
func lineToStructWorker(workerID int, lineQ <-chan string, strQ chan<- myStruct ) {
for j := range lineQ {
strQ <- lineToStruct(j) // just parses the csv to a struct...
}
}
func WorkerProcessStruct(workerID int, strQ <-chan myStruct, done chan<- myStruct) {
for a := range strQ {
time.Sleep(time.Millisecond * 500) // fake long operation...
done <- a
}
}
I know the problem is related to the "done" channel because if I don't use it, there's no error, but I can't figure out how to fix it.
You don't start reading from doneQ until you've finished sending all the lines to lineParseQ, which is more lines than there is buffer space. So once the doneQ buffer is full, that send blocks, which starts filling the lineParseQ buffer, and once that's full, it deadlocks. Move either the loop sending to lineParseQ, the loop reading from doneQ, or both, to separate goroutine(s), e.g.:
go func() {
for _, line := range lines {
countSend++
lineParseQ <- line
}
close(lineParseQ)
}()
This will still deadlock at the end, because you've got a range over a channel and the close after it in the same goroutine; since range continues until the channel is closed, and the close comes after the range finishes, you still have a deadlock. You need to put the closes in appropriate places; that being, either in the sending routine, or blocked on a WaitGroup monitoring the sending routines if there are multiple senders for a given channel.
// Start line parsing workers and send to jobProcessQ
wg := new(sync.WaitGroup)
for i := 1; i <= 2; i++ {
wg.Add(1)
go lineToStructWorker(i, lineParseQ, jobProcessQ, wg)
}
// Process myStruct from jobProcessQ
for i := 1; i <= 5; i++ {
go WorkerProcessStruct(i, jobProcessQ, doneQ)
}
countSend := 0
go func() {
for _, line := range lines {
countSend++
lineParseQ <- line
}
close(lineParseQ)
}()
go func() {
wg.Wait()
close(jobProcessQ)
}()
for a := range doneQ {
fmt.Printf("Received %v.\n", a)
}
// ...
func lineToStructWorker(workerID int, lineQ <-chan string, strQ chan<- myStruct, wg *sync.WaitGroup) {
for j := range lineQ {
strQ <- lineToStruct(j) // just parses the csv to a struct...
}
wg.Done()
}
func WorkerProcessStruct(workerID int, strQ <-chan myStruct, done chan<- myStruct) {
for a := range strQ {
time.Sleep(time.Millisecond * 500) // fake long operation...
done <- a
}
close(done)
}
Full working example here: https://play.golang.org/p/XsnewSZeb2X
Coordinate the pipeline with sync.WaitGroup breaking each piece into stages. When you know one piece of the pipeline is complete (and no one is writing to a particular channel), close the channel to instruct all "workers" to exit e.g.
var wg sync.WaitGroup
for i := 1; i <= 5; i++ {
i := i
wg.Add(1)
go func() {
Worker(i)
wg.Done()
}()
}
// wg.Wait() signals the above have completed
Buffered channels are handy to handle burst workloads, but sometimes they are used to avoid deadlocks in poor designs. If you want to avoid running certain parts of your pipeline in a goroutine you can buffer some channels (matching the number of workers typically) to avoid a blockage in your main goroutine.
If you have dependent pieces that read & write and want to avoid deadlock - ensure they are in separate goroutines. Having all parts of the pipeline it its own goroutine will even remove the need for buffered channels:
// putting all channel work into separate goroutines
// removes the need for buffered channels
lineParseQ := make(chan string, 0)
jobProcessQ := make(chan myStruct, 0)
doneQ := make(chan myStruct, 0)
Its a tradeoff of course - a goroutine costs about 2K in resources - versus a buffered channel which is much less. As with most designs it depends on how it is used.
Also don't get caught by the notorious Go for-loop gotcha, so use a closure assignment to avoid this:
for i := 1; i <= 5; i++ {
i := i // new i (not the i above)
go func() {
myfunc(i) // otherwise all goroutines will most likely get '5'
}()
}
Finally ensure you wait for all results to be processed before exiting.
It's a common mistake to return from a channel based function and believe all results have been processed. In a service this will eventually be true. But in a standalone executable the processing loop may still be working on results.
go func() {
wgW.Wait() // waiting on worker goroutines to finish
close(doneQ) // safe to close results channel now
}()
// ensure we don't return until all results have been processed
for a := range doneQ {
fmt.Printf("Received %v.\n", a)
}
by processing the results in the main goroutine, we ensure we don't return prematurely without having processed everything.
Pulling it all together:
https://play.golang.org/p/MjLpQ5xglP3

Why is there such a result?

I wrote a golang script to scan for open ports and use sync.WaitGourp to control the number of goroutines.
When the goroutine is too large, such as 2000, the result is different from 1000.
Similar to exiting early. code show as below
func worker(wg *sync.WaitGroup) {
for job := range jobs {
_, err := net.DialTimeout("tcp", fmt.Sprintf("%s:%d", job.host, job.port), time.Millisecond*1500)
if err != nil {
results <- Result{job, false}
} else {
results <- Result{job, true}
}
}
wg.Done()
}
func main() {
go func() {
for i := 1; i < 65535; i++ {
jobs <- Job{host, i}
}
close(jobs)
}()
go func() {
for result := range results {
if result.status {
fmt.Println(result.job, "open")
}
}
}()
wg := sync.WaitGroup{}
for i := 1; i < 1000; i++ {
wg.Add(1)
go worker(&wg)
}
wg.Wait()
}
when 1000
{127.0.0.1 80} open
{127.0.0.1 631} open
{127.0.0.1 3306} open
{127.0.0.1 6379} open
{127.0.0.1 33060} open
when 2000
{127.0.0.1 80} open
{127.0.0.1 631} open
I want 2000 to output all ports like 1000
You do not wait for the two "non-worker" goroutines in main, so as soon as wg.Wait() there returns, the process shuts down, tearing down any outstanding goroutines.
Since one of them is processing the results, this appears to you as if not all the tasks were processed (and this is true).
Close the results channel when workers are done. Process the results in the main goroutine.
wg := sync.WaitGroup{}
for i := 1; i < 1000; i++ {
wg.Add(1)
go worker(&wg)
}
go func() {
for i := 1; i < 65535; i++ {
jobs <- Job{host, i}
}
// No more jobs, exit from worker loops.
close(jobs)
// Wait for workers to write all results and exit.
wg.Wait()
// No more results, exit from main loop.
close(results)
}()
for result := range results {
if result.status {
fmt.Println(result.job, "open")
}
}
View the complete program on the GoLang PlayGround.

Waitgroups and synchronization not working in Go

I've been experimenting with goroutines and channels, and I wanted to test the WaitGroup feature. Here I'm trying to execute an HTTP flood job, where the parent thread spawns a lot of goroutines which will make infinite requests, unless receiving a stop message:
func (hf *HTTPFlood) Run() {
childrenStop := make(chan int, hf.ConcurrentCalls)
stop := false
totalRequests := 0
requestsChan := make(chan int)
totalErrors := 0
errorsChan := make(chan int)
var wg sync.WaitGroup
for i := 0; i < hf.ConcurrentCalls; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for {
select {
case <-childrenStop:
fmt.Printf("stop child\n")
return
default:
_, err := Request(hf.Victim.String())
requestsChan <- 1
if err != nil {
errorsChan <- 1
}
}
}
}()
}
timeout := time.NewTimer(time.Duration(MaximumJobTime) * time.Second)
for !stop {
select {
case req := <- requestsChan:
totalReq += req
case err := <- errorsChan:
totalErrors += err
case <- timeout.C:
fmt.Printf("%s timed up\n", hf.Victim.String())
for i := 0; i < hf.ConcurrentCalls; i++ {
childrenStop <- 1
}
close(childrenStop)
stop = true
break
}
}
fmt.Printf("waiting\n")
wg.Wait()
fmt.Printf("after wait\n")
close(requestsChan)
close(errorsChan)
fmt.Printf("end\n")
}
Once timeout is fired, the parent thread successfully exits the loop and reaches the Wait instruction, but even though the stopChildren channel is filled, the child goroutines seem to never receive messages on the stopChildren channel.
What am I missing?
EDIT:
So the issue obviously was how the channels and its sends/receives were managed.
First of all the childrenStop channel was closed before all childs had received the message. The channel should be closed after the Wait
On the other hand, since no reads were done neither on requestsChan nor errorsChan once the parent thread sends the stop signal, most of the childs stayed blocked sending on these two channels. I tried to keep reading in the parent thread, outside the loop just before the Wait but that didn't work so I switched the implementation to Atomic counters which seem to be a more suitable way to manage this specific use case.
func (hf *HTTPFlood) Run() {
childrenStop := make(chan int, hf.ConcurrentCalls)
var totalReq uint64
var totalErrors uint64
var wg sync.WaitGroup
for i := 0; i < hf.ConcurrentCalls; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for {
select {
case <-childrenStop:
fmt.Printf("stop child\n")
return
default:
_, err := Request(hf.Victim.String())
atomic.AddUint64(&totalReq, 1)
if err != nil {
atomic.AddUint64(&totalErrors, 1)
}
}
}
}()
}
timeout := time.NewTimer(time.Duration(MaximumJobTime) * time.Second)
<- timeout.C
fmt.Printf("%s timed up\n", hf.Victim.String())
for i := 0; i < hf.ConcurrentCalls; i++ {
childrenStop <- 1
}
fmt.Printf("waiting\n")
wg.Wait()
fmt.Printf("after wait\n")
close(childrenStop)
fmt.Printf("end\n")
}
Your go routines can be is blocked at requestsChan <- 1.
case <- timeout.C:
fmt.Printf("%s timed up\n", hf.Victim.String())
for i := 0; i < hf.ConcurrentCalls; i++ {
childrenStop <- 1
}
close(childrenStop)
stop = true
break
Here you are sending a number to childrenStop and expect the go routines to receive it. But while you are sending the childrenStop signal, your routines could have sent something on requestsChan. But as you break from the loop after sending the close signals, there's no one listening on requestsChan to receive.
You can confirm this by printing something just before and after requestsChan <- 1 to confirm the behaviour.
A channel will block when you send something on it while no one is receiving on the other end
Here is a possible modification.
package main
import (
"fmt"
"time"
)
func main() {
requestsChan := make(chan int)
done := make(chan chan bool)
for i := 0; i < 5; i++ {
go func(it int) {
for {
select {
case c := <-done:
c <- true
return
default:
requestsChan <- it
}
}
}(i)
}
max := time.NewTimer(1 * time.Millisecond)
allChildrenDone := make(chan bool)
childrenDone := 0
childDone := make(chan bool)
go func() {
for {
select {
case i := <-requestsChan:
fmt.Printf("received %d;", i)
case <-max.C:
fmt.Println("\nTimeup")
for i := 0; i < 5; i++ {
go func() {
done <- childDone
fmt.Println("sent done")
}()
}
case <-childDone:
childrenDone++
fmt.Println("child done ", childrenDone)
if childrenDone == 5 {
allChildrenDone <- true
return
}
}
}
}()
fmt.Println("Waiting")
<-allChildrenDone
}
Thing to note here is that am sending the close signal in go routines so that the loop can continue while i wait for all the children to have cleanly exit.
Please watch this talk by Rob Pike which covers these details clearly.
[Edit]: The previous code would have resulted in a running routine after exiting.

Synchronize workers for recursive crawl

I would like to implement a "crawler" with n workers where each worker is able to add additional jobs. The program should stop when there are no jobs left and all workers have finished their work.
I have the following code (you can play with it at https://play.golang.org/p/_j22p_OfYv):
package main
import (
"fmt"
"sync"
)
func main() {
pathChan := make(chan string)
fileChan := make(chan string)
workers := 3
var wg sync.WaitGroup
paths := map[string][]string{
"/": {"/test", "/foo", "a", "b"},
"/test": {"aa", "bb", "cc"},
"/foo": {"/bar", "bbb", "ccc"},
"/bar": {"aaaa", "bbbb", "cccc"},
}
for i := 0; i < workers; i++ {
wg.Add(1)
go func() {
for {
path, ok := <-pathChan
if !ok {
break
}
for _, f := range paths[path] {
if f[0] == '/' {
pathChan <- f
} else {
fileChan <- f
}
}
}
wg.Done()
}()
}
pathChan <- "/"
for {
filePath, ok := <-fileChan
if !ok {
break
}
fmt.Println(filePath)
}
wg.Wait()
close(pathChan)
}
Unfortunately, this ends in a dead-lock. Where exactly is the problem? Also, what is the best practice to write such functionality? Are channels the correct feature to use?
EDIT:
I have updated my code to use two wait groups, one for the jobs and one for the workers (see https://play.golang.org/p/bueUJzMhqj):
package main
import (
"fmt"
"sync"
)
func main() {
pathChan := make(chan string)
fileChan := make(chan string)
jobs := new(sync.WaitGroup)
workers := new(sync.WaitGroup)
nworkers := 2
paths := map[string][]string{
"/": {"/test", "/foo", "a", "b"},
"/test": {"aa", "bb", "cc"},
"/foo": {"/bar", "bbb", "ccc"},
"/bar": {"aaaa", "bbbb", "cccc"},
}
for i := 0; i < nworkers; i++ {
workers.Add(1)
go func() {
defer workers.Done()
for {
path, ok := <-pathChan
if !ok {
break
}
for _, f := range paths[path] {
if f[0] == '/' {
jobs.Add(1)
pathChan <- f
} else {
fileChan <- f
}
}
jobs.Done()
}
}()
}
jobs.Add(1)
pathChan <- "/"
go func() {
jobs.Wait()
close(pathChan)
workers.Wait()
close(fileChan)
}()
for {
filePath, ok := <-fileChan
if !ok {
break
}
fmt.Println(filePath)
}
}
This indeed seems to work, but obviously a deadlock will still happen if nworkers is set to 1, because the single worker will wait forever when adding something to the channel pathChan. To solve this issue, the channel buffer can be increased (e.g. pathChan := make(chan string, 2)), but this will only work as long as two buffer isn't completely full. Of course, the buffer size could be set to a large number, say 10000, but the code could still hit a deadlock. Additionally, this doesn't seem to be a clean solution to me.
This is where I realized that it would be easier to use some sort of queue instead of a channel, where elements can be added and removed without blocking and where the size of the queue isn't fixed. Do such queues exist in the Go standard library?
If you want to wait for an arbitrary number of workers to finish, the standard library includes sync.WaitGroup for exactly this purpose.
There are other concurrency issues as well:
You're using channel closure signalling, but you have multiple goroutines sending on the same channel. This is generally bad practice: since each routine can never know when the other routines are done with the channel, you can never correctly close the channel.
Closing one channel waits on the other to be closed first, but it will never be closed, so it deadlocks.
The only reason it doesn't deadlock immediately is your example happens to have more workers than directories under "/". Add two more directories under "/" and it deadlocks immediately.
There are some solutions:
Dump the worker pool and just spin a goroutine for every subdirectory, and let the scheduler worry about the rest: https://play.golang.org/p/ck2DkNFnyF
Use one worker per root-level directory, and have each worker process its directory recursively rather than queuing subdirectories it finds to a channel.

Resources