I have a list to go through, doing something with each value, so I thought of using goroutines. I need to cap the number of goroutines, and each goroutine makes a call that returns (response, err). When err is not nil, I need to terminate all the goroutines and return an HTTP response; if there is no err, I also need to terminate the goroutines and return an HTTP response.
With few values it works fine, but with many values I have a problem: when I call cancel, there are still goroutines trying to send to the response channel that is already closed, and I keep getting errors like:
goroutine 36 [chan send]:
This is the code:
type response struct {
value string
}
func Testing() []response {
fakeValues := getFakeValues()
maxParallel := 25
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if len(fakeValues) < maxParallel {
maxParallel = len(fakeValues)
}
type responseChannel struct {
Response response
Err error
}
reqChan := make(chan string) //make this an unbuffered channel
resChan := make(chan responseChannel)
wg := &sync.WaitGroup{}
wg.Add(maxParallel)
for i := 0; i < maxParallel; i++ {
go func(ctx context.Context, ch chan string, resChan chan responseChannel) {
for {
select {
case val := <-ch:
resp, err := getFakeResult(val)
resChan <- responseChannel{
Response: resp,
Err: err,
}
case <-ctx.Done():
wg.Done()
return
}
}
}(ctx, reqChan, resChan)
}
go func() {
for _, body := range fakeValues {
reqChan <- body
}
close(reqChan)
cancel()
}()
go func() {
wg.Wait()
close(resChan)
}()
var hasErr error
response := make([]response, 0, len(fakeValues))
for res := range resChan {
if res.Err != nil {
hasErr = res.Err
cancel()
break
}
response = append(response, res.Response)
}
if hasErr != nil {
// return responses.ErrorResponse(hasErr) // returns http response
}
// return responses.Accepted(response, nil) // returns http response
return nil
}
func getFakeValues() []string {
return []string{"a"}
}
func getFakeResult(val string) (response, error) {
if val == "" {
return response{}, fmt.Errorf("ooh noh:%s", val)
}
return response{
value: val,
}, nil
}
The workers end up blocked on sending to resChan because it is unbuffered and, after an error, nothing reads from it any more.
You can either make resChan buffered, with a capacity at least as large as maxParallel, or make the send also watch for cancellation, e.g. change the resChan <- send to:
select {
case resChan <- responseChannel{
Response: resp,
Err: err,
}:
case <-ctx.Done():
}
There are two main problems with your solution:
First, if your fakeValues slice has more items than maxParallel+1, your program will block on this part:
for _, body := range fakeValues {
reqChan <- body
}
How does this happen? As you start putting values into reqChan, each started goroutine reads one value from reqChan and tries to write its response to resChan. But since nothing is reading from resChan yet, each goroutine blocks there (on the write to resChan). Once every goroutine is blocked, nothing reads from reqChan either, and you cannot put any more values into it (apart from one buffered value).
Second, you are passing the context to your goroutines, but you are not doing anything with it. You can use the ctx.Done() channel to get a signal to exit the goroutine. Something like this:
go func(ctx context.Context, ch chan string, resChan chan responseChannel) {
for {
select {
case val := <-ch:
resp, err := getFakeResult(val)
resChan <- responseChannel{
Response: resp,
Err: err,
}
case <- ctx.Done():
return
}
}
}(ctx, reqChan, resChan)
Now, to tie everything together so that there are no deadlocks, no race conditions, and no situations where values are not processed, a few other changes need to be made. I've posted the entire code below.
func Testing() []response {
fakeValues := getFakeValues()
maxParallel := 25
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if len(fakeValues) < maxParallel {
maxParallel = len(fakeValues)
}
type responseChannel struct {
Response response
Err error
}
reqChan := make(chan string) //make this an unbuffered channel
resChan := make(chan responseChannel)
wg := &sync.WaitGroup{}
wg.Add(maxParallel)
for i := 0; i < maxParallel; i++ {
go func(ctx context.Context, ch chan string, resChan chan responseChannel) {
defer wg.Done() // runs on every exit path
for {
select {
case val, ok := <-ch:
if !ok {
// reqChan is closed: no more work, and no zero values to process
return
}
resp, err := getFakeResult(val)
resChan <- responseChannel{
Response: resp,
Err: err,
}
case <-ctx.Done():
return
}
}
}(ctx, reqChan, resChan)
}
go func() {
for _, body := range fakeValues {
reqChan <- body
}
close(reqChan)
//putting cancel here so that it can terminate all goroutines when all values are read from reqChan
cancel()
}()
go func() {
wg.Wait()
close(resChan)
}()
var hasErr error
response := make([]response, 0, len(fakeValues))
for res := range resChan {
if res.Err != nil {
hasErr = res.Err
cancel()
break
}
response = append(response, res.Response)
}
if hasErr != nil {
return responses.ErrorResponse(hasErr) // returns http response
}
return responses.Accepted(response, nil) // returns http response
}
In short, the changes are:
reqChan is an unbuffered channel, so no values are left sitting unprocessed in a buffer when the goroutines that read from it exit.
The worker goroutines now handle both exit cases: when an error happens and when there is no more data in reqChan to process. wg.Done() is also executed when the context is canceled, to ensure that resChan is eventually closed.
A separate goroutine puts the data into reqChan without blocking the program, closes it afterward, and cancels the context.
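The changes listed above can be combined into a compact pattern. The following is only a sketch under assumptions, not the original poster's code: process is a hypothetical stand-in for getFakeResult, and runPool and result are illustrative names. Workers stop when reqChan is closed, and every send also watches ctx.Done() so that no goroutine stays blocked after the reader returns on the first error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// result pairs a processed value with the error from processing it.
type result struct {
	value string
	err   error
}

// process is a hypothetical stand-in for getFakeResult: it fails on empty input.
func process(val string) (string, error) {
	if val == "" {
		return "", errors.New("empty value")
	}
	return "done:" + val, nil
}

// runPool processes values with at most maxParallel workers and returns the
// first error it sees, cancelling the remaining work.
func runPool(values []string, maxParallel int) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	if len(values) < maxParallel {
		maxParallel = len(values)
	}
	reqChan := make(chan string)
	resChan := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < maxParallel; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// range ends when reqChan is closed, so no zero values are read.
			for val := range reqChan {
				v, err := process(val)
				select {
				case resChan <- result{value: v, err: err}:
				case <-ctx.Done():
					return // the reader has gone away; do not block on the send
				}
			}
		}()
	}

	// Feed the workers; stop early if the context is cancelled.
	go func() {
		defer close(reqChan)
		for _, v := range values {
			select {
			case reqChan <- v:
			case <-ctx.Done():
				return
			}
		}
	}()

	// Close resChan once every worker has returned.
	go func() {
		wg.Wait()
		close(resChan)
	}()

	out := make([]string, 0, len(values))
	for res := range resChan {
		if res.err != nil {
			return nil, res.err // the deferred cancel releases the other goroutines
		}
		out = append(out, res.value)
	}
	return out, nil
}

func main() {
	out, err := runPool([]string{"a", "b", "c"}, 2)
	fmt.Println(len(out), err)

	_, err = runPool([]string{"a", "", "c"}, 2)
	fmt.Println(err)
}
```

In this variant the context is cancelled only when the reader returns, so results that are still in flight on the success path are not dropped.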
I am having an issue using a WaitGroup with a buffered channel. The problem is that the WaitGroup completes before the channel has been read completely, so my channel is only half read and processing breaks off in between.
func main() {
var wg sync.WaitGroup
var err error
start := time.Now()
students := make([]studentDetails, 0)
studentCh := make(chan studentDetail, 10000)
errorCh := make(chan error, 1)
wg.Add(1)
go s.getDetailStudents(rCtx, studentCh , errorCh, &wg, s.Link, false)
go func(ch chan studentDetail, e chan error) {
LOOP:
for {
select {
case p, ok := <-ch:
if ok {
L.Printf("Links %s: [%s]\n", p.title, p.link)
students = append(students, p)
} else {
L.Print("Closed channel")
break LOOP
}
case err = <-e:
if err != nil {
break
}
}
}
}(studentCh, errorCh)
wg.Wait()
close(studentCh)
close(errorCh)
L.Warnln("closed: all wait-groups completed!")
L.Warnf("total items fetched: %d", len(students))
elapsed := time.Since(start)
L.Warnf("operation took %s", elapsed)
}
The problem is that this function is recursive: it makes an HTTP call to fetch students and then makes more calls depending on a condition.
func (s Student) getDetailStudents(rCtx context.Context, content chan<- studentDetail, errorCh chan<- error, wg *sync.WaitGroup, url string, subSection bool) {
util.MustNotNil(rCtx)
L := logger.GetLogger(rCtx)
defer func() {
L.Println("Closing all waitgroup!")
wg.Done()
}()
wc := getWC()
httpClient := wc.Registry.MustHTTPClient()
res, err := httpClient.Get(url)
if err != nil {
L.Fatal(err)
}
defer res.Body.Close()
if res.StatusCode != 200 {
L.Errorf("status code error: %d %s", res.StatusCode, res.Status)
errorCh <- errors.New("service_status_code")
return
}
// parse response and return error if found some through errorCh as done above.
// decide page subSection based on response if it is more.
if !subSection {
wg.Add(1)
go s.getDetailStudents(rCtx, content, errorCh, wg, link, true)
// L.Warnf("total pages found %d", pageSub.Length()+1)
}
// Find students from response list and parse each Student
students := s.parseStudentItemList(rCtx, item)
for _, student := range students {
content <- student
}
L.Warnf("Calling HTTP Service for %q with total %d record", url, elementsSub.Length())
}
Variable names have been changed to avoid exposing the original code base.
The problem is that reading of students stops at an arbitrary point as soon as the WaitGroup completes. I expect execution to be held until all students have been read; in case of an error, it should break as soon as the error is encountered.
You need to know when the receiving goroutine completes. The existing WaitGroup only tells you when the generating goroutines are done. So, you can use two WaitGroups:
wg.Add(1)
go s.getDetailStudents(rCtx, studentCh , errorCh, &wg, s.Link, false)
wgReader.Add(1)
go func(ch chan studentDetail, e chan error) {
defer wgReader.Done()
...
}
wg.Wait()
close(studentCh)
close(errorCh)
wgReader.Wait() // Wait for the readers to complete
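A self-contained sketch of the two-WaitGroup idea, with made-up names (collect, producers, reader) and integers standing in for student records. The producers' WaitGroup decides when the channel may be closed, and the reader's WaitGroup decides when the collected slice is safe to use.

```go
package main

import (
	"fmt"
	"sync"
)

// collect starts n producers that each send one item, and a single reader
// that appends every item to a slice. It returns only after the reader has
// drained the channel.
func collect(n int) []int {
	var producers, reader sync.WaitGroup
	items := make(chan int, 4)
	var got []int

	reader.Add(1)
	go func() {
		defer reader.Done()
		for v := range items { // ends when items is closed and drained
			got = append(got, v)
		}
	}()

	for i := 0; i < n; i++ {
		producers.Add(1)
		go func(i int) {
			defer producers.Done()
			items <- i
		}(i)
	}

	producers.Wait() // every send has completed
	close(items)     // lets the reader's range loop finish
	reader.Wait()    // the reader has appended everything
	return got
}

func main() {
	fmt.Println(len(collect(100)))
}
```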
Since you are using buffered channels, you can retrieve the remaining values after closing the channel. You will also need a mechanism to prevent your main function from exiting too early while the reader is still doing work, as @Burak Serdar has advised.
I restructured the code to give a working example but it should get the point across.
package main
import (
"context"
"log"
"sync"
"time"
)
type studentDetails struct {
title string
link string
}
func main() {
var wg sync.WaitGroup
var err error
students := make([]studentDetails, 0)
studentCh := make(chan studentDetails, 10000)
errorCh := make(chan error, 1)
start := time.Now()
wg.Add(1)
go getDetailStudents(context.TODO(), studentCh, errorCh, &wg, "http://example.com", false)
// Start waiting on the WaitGroup once, not on every loop iteration.
done := wrapWait(&wg)
LOOP:
for {
select {
case p, ok := <-studentCh:
if !ok {
// closed and fully drained
break LOOP
}
log.Printf("Links %s: [%s]\n", p.title, p.link)
students = append(students, p)
case err = <-errorCh:
if err != nil {
break LOOP
}
case <-done:
// all producers have finished; buffered values can still be received
close(studentCh)
done = nil // a nil channel is never selected again
}
}
close(errorCh)
elapsed := time.Since(start)
log.Printf("operation took %s", elapsed)
}
func getDetailStudents(rCtx context.Context, content chan<- studentDetails, errorCh chan<- error, wg *sync.WaitGroup, url string, subSection bool) {
defer func() {
log.Println("Closing")
wg.Done()
}()
if !subSection {
wg.Add(1)
go getDetailStudents(rCtx, content, errorCh, wg, url, true)
// L.Warnf("total pages found %d", pageSub.Length()+1)
}
content <- studentDetails{
title: "title",
link: "link",
}
}
// helper function to allow using WaitGroup in a select
func wrapWait(wg *sync.WaitGroup) <-chan struct{} {
out := make(chan struct{})
go func() {
wg.Wait()
out <- struct{}{}
}()
return out
}
wg.Add(1)
go func(){
defer wg.Done()
// I do not think that you need a recursive function.
// this function overcomplicated.
s.getDetailStudents(rCtx, studentCh , errorCh, &wg, s.Link, false)
}(...)
wg.Add(1)
go func(ch chan studentDetail, e chan error) {
defer wg.Done()
...
}(...)
wg.Wait()
close(studentCh)
close(errorCh)
This should solve the problem. s.getDetailStudents function must be simplified. Making it recursive does not have any benefit.
I have a function that worked before I used a goroutine:
res, err := example(a , b)
if err != nil {
return Response{
ErrCode: 1,
ErrMsg:"error",
}
}
Response is a struct that holds the error info. When I use a goroutine:
var wg sync.WaitGroup()
wg.Add(1)
go func(){
defer wg.Done()
res, err := example(a , b)
if err != nil {
return Response{
ErrCode: 1,
ErrMsg:"error",
}
}()
wg.Wait()
Then I got
too many arguments to return
have (Response)
want ()
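You need to use a channel to achieve what you want:
func main() {
	c := make(chan Response)
	go func() {
		res, err := example(a, b)
		if err != nil {
			c <- Response{
				ErrCode: 1,
				ErrMsg:  "error",
			}
			return
		}
		_ = res         // use res here on success
		c <- Response{} // always send, or the receive below blocks forever
	}()
	value := <-c
	_ = value // use value here
}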
The function you provide to spawn a goroutine has no return value in its signature. Goroutines cannot return data: running a function asynchronously and fetching its return value are essentially contradictory actions. Simply put, a goroutine has nowhere to return the data to, so the go statement does not allow it.
You can do something like this:
var response Response
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
_, err := example(a, b)
if err != nil {
response = Response{ErrCode: 1, ErrMsg: "error"}
}
}()
wg.Wait()
return response
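An alternative sketch that passes both the value and the error out of the goroutine through a channel. The example function here is a hypothetical stand-in (integer division), and outcome and run are illustrative names. Only the Response struct follows the question.

```go
package main

import (
	"errors"
	"fmt"
)

// Response mirrors the struct from the question.
type Response struct {
	ErrCode int
	ErrMsg  string
}

// example is a hypothetical stand-in for the asker's function.
func example(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// outcome carries either a value or an error out of the goroutine.
type outcome struct {
	res int
	err error
}

// run calls example in a goroutine and turns its outcome into a Response.
func run(a, b int) Response {
	ch := make(chan outcome, 1) // buffered: the goroutine never blocks on the send
	go func() {
		res, err := example(a, b)
		ch <- outcome{res: res, err: err}
	}()

	out := <-ch
	if out.err != nil {
		return Response{ErrCode: 1, ErrMsg: out.err.Error()}
	}
	return Response{}
}

func main() {
	fmt.Printf("%+v\n", run(4, 2))
	fmt.Printf("%+v\n", run(4, 0))
}
```

Sending exactly one outcome on every path means the receiver never has to guess whether a value will arrive.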
I have a simple Kafka consumer for which I have created a handle, and I am trying to read from it in a goroutine:
func process(ctx context.Context){
consumer := queueHandle.Consume(topic_ops_req, consumerHandler)
// Get signal for finish
doneCh := make(chan struct{})
go func(consumer chan *sarama.ConsumerMessage, ctx context.Context) {
for {
select {
case msg, ok := <-consumer:
if !ok {
logger.Info("Channel has been closed")
doneCh <- struct{}{}
return
}
var request queue.Request
err := json.Unmarshal(msg.Value, &request)
if err != nil {
logger.Error("consumer unmarshal err", err)
panic(err)
}
res, err := new_process(ctx, request, service) // call another func
if err != nil {
//TODO
}
result = res
doneCh <- struct{}{}
case <-ctx.Done():
logger.Info(fmt.Sprintf("Context ended with err : %s", ctx.Err()))
doneCh <- struct{}{}
}
}
}(consumer, ctx)
<-doneCh
}
The issue I am seeing is that once I introduce the "case <-ctx.Done()", the goroutine never enters the "case msg, ok := <-consumer" branch and always reports that the context ended. How do I make my goroutine work with both the consumer channel and ctx.Done()?
In a function definition, if a channel is an argument without a direction, does it have to send or receive something?
func makeRequest(url string, ch chan<- string, results chan<- string) {
start := time.Now()
resp, err := http.Get(url)
defer resp.Body.Close()
if err != nil {
fmt.Printf("%v", err)
}
resp, err = http.Post(url, "text/plain", bytes.NewBuffer([]byte("Hey")))
defer resp.Body.Close()
secs := time.Since(start).Seconds()
if err != nil {
fmt.Printf("%v", err)
}
// Cannot move past this.
ch <- fmt.Sprintf("%f", secs)
results <- <- ch
}
func MakeRequestHelper(url string, ch chan string, results chan string, iterations int) {
for i := 0; i < iterations; i++ {
makeRequest(url, ch, results)
}
for i := 0; i < iterations; i++ {
fmt.Println(<-ch)
}
}
func main() {
args := os.Args[1:]
threadString := args[0]
iterationString := args[1]
url := args[2]
threads, err := strconv.Atoi(threadString)
if err != nil {
fmt.Printf("%v", err)
}
iterations, err := strconv.Atoi(iterationString)
if err != nil {
fmt.Printf("%v", err)
}
channels := make([]chan string, 100)
for i := range channels {
channels[i] = make(chan string)
}
// results aggregate all the things received by channels in all goroutines
results := make(chan string, iterations*threads)
for i := 0; i < threads; i++ {
go MakeRequestHelper(url, channels[i], results, iterations)
}
resultSlice := make([]string, threads*iterations)
for i := 0; i < threads*iterations; i++ {
resultSlice[i] = <-results
}
}
In the above code,
ch <- or <-results
seems to be blocking every goroutine that executes makeRequest.
I am new to Go's concurrency model. I understand that sending to and receiving from a channel blocks, but I find it difficult to work out what is blocking what in this code.
I'm not really sure what you are doing... It seems really convoluted. I suggest you read up on how to use channels.
https://tour.golang.org/concurrency/2
That being said, you have so much going on in your code that it was much easier to just gut it down to something a bit simpler (it can be simplified further). I left comments to explain the code.
package main
import (
"fmt"
"io/ioutil"
"log"
"net/http"
"sync"
"time"
)
// using structs is a nice way to organize your code
type Worker struct {
wg sync.WaitGroup
semaphore chan struct{}
result chan Result
client http.Client
}
// group returns so that you don't have to send to many channels
type Result struct {
duration float64
results string
}
// closing your channels will stop the for loop in main
func (w *Worker) Close() {
close(w.semaphore)
close(w.result)
}
func (w *Worker) MakeRequest(url string) {
// a semaphore is a simple way to rate limit the amount of goroutines running at any single point of time
// google them, Go uses them often
w.semaphore <- struct{}{}
defer func() {
w.wg.Done()
<-w.semaphore
}()
start := time.Now()
resp, err := w.client.Get(url)
if err != nil {
log.Println("error", err)
return
}
defer resp.Body.Close()
// don't have any examples where I need to also POST anything but the point should be made
// resp, err = http.Post(url, "text/plain", bytes.NewBuffer([]byte("Hey")))
// if err != nil {
// log.Println("error", err)
// return
// }
// defer resp.Body.Close()
secs := time.Since(start).Seconds()
b, err := ioutil.ReadAll(resp.Body)
if err != nil {
log.Println("error", err)
return
}
w.result <- Result{duration: secs, results: string(b)}
}
func main() {
urls := []string{"https://facebook.com/", "https://twitter.com/", "https://google.com/", "https://youtube.com/", "https://linkedin.com/", "https://wordpress.org/",
"https://instagram.com/", "https://pinterest.com/", "https://wikipedia.org/", "https://wordpress.com/", "https://blogspot.com/", "https://apple.com/",
}
workerNumber := 5
worker := Worker{
semaphore: make(chan struct{}, workerNumber),
result: make(chan Result),
client: http.Client{Timeout: 5 * time.Second},
}
// use sync groups to allow your code to wait for
// all your goroutines to finish
for _, url := range urls {
worker.wg.Add(1)
go worker.MakeRequest(url)
}
// by declaring wait and close as a separate goroutine
// I can get to the for loop below and iterate on the results
// in a non blocking fashion
go func() {
worker.wg.Wait()
worker.Close()
}()
// do something with the results channel
for res := range worker.result {
fmt.Printf("Request took %.2f seconds.\nResults: %s\n\n", res.duration, res.results)
}
}
The channels in channels are nil (no make is executed; you make the slice but not the channels), so any send or receive will block. I'm not sure exactly what you're trying to do here, but that's the basic problem.
See https://golang.org/doc/effective_go.html#channels for an explanation of how channels work.
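The difference between making a slice of channels and making the channels themselves can be checked with a non-blocking send. This is a small sketch, with trySend as an illustrative helper name: a send on a nil channel is never ready, so the default branch is taken.

```go
package main

import "fmt"

// trySend reports whether a send on ch can proceed immediately.
func trySend(ch chan string) bool {
	select {
	case ch <- "x":
		return true
	default:
		return false
	}
}

func main() {
	channels := make([]chan string, 2) // a slice holding two nil channels
	fmt.Println(channels[0] == nil, trySend(channels[0]))

	channels[0] = make(chan string, 1) // now a usable, buffered channel
	fmt.Println(channels[0] == nil, trySend(channels[0]))
}
```

Without the default branch, the send on the nil channel would block forever.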
I am trying to catch errors from a group of goroutines using a channel, but the goroutine reading the channel enters an infinite loop and starts consuming CPU.
func UnzipFile(f *bytes.Buffer, location string) error {
zipReader, err := zip.NewReader(bytes.NewReader(f.Bytes()), int64(f.Len()))
if err != nil {
return err
}
if err := os.MkdirAll(location, os.ModePerm); err != nil {
return err
}
errorChannel := make(chan error)
errorList := []error{}
go errorChannelWatch(errorChannel, errorList)
fileWaitGroup := &sync.WaitGroup{}
for _, file := range zipReader.File {
fileWaitGroup.Add(1)
go writeZipFileToLocal(file, location, errorChannel, fileWaitGroup)
}
fileWaitGroup.Wait()
close(errorChannel)
log.Println(errorList)
return nil
}
func errorChannelWatch(ch chan error, list []error) {
for {
select {
case err := <- ch:
list = append(list, err)
}
}
}
func writeZipFileToLocal(file *zip.File, location string, ch chan error, wg *sync.WaitGroup) {
defer wg.Done()
zipFilehandle, err := file.Open()
if err != nil {
ch <- err
return
}
defer zipFilehandle.Close()
if file.FileInfo().IsDir() {
if err := os.MkdirAll(filepath.Join(location, file.Name), os.ModePerm); err != nil {
ch <- err
}
return
}
localFileHandle, err := os.OpenFile(filepath.Join(location, file.Name), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, file.Mode())
if err != nil {
ch <- err
return
}
defer localFileHandle.Close()
if _, err := io.Copy(localFileHandle, zipFilehandle); err != nil {
ch <- err
return
}
ch <- fmt.Errorf("Test error")
}
So I am looping over a slice of files and writing them to disk; when there is an error, I report it to errorChannel so that it is saved into a slice.
I use a sync.WaitGroup to wait for all goroutines, and when they are done I want to print errorList and check whether there was any error during execution.
The list is always empty, even if I add ch <- fmt.Errorf("test") at the end of writeZipFileToLocal, and the channel always hangs.
I am not sure what I am missing here.
1. For the first point, the infinite loop:
Quoting the Go language specification:
A receive operation on a closed channel can always proceed
immediately, yielding the element type's zero value after any
previously sent values have been received.
So in this function
func errorChannelWatch(ch chan error, list []error) {
for {
select {
case err := <- ch:
list = append(list, err)
}
}
}
After ch gets closed, this turns into an infinite loop that keeps appending nil values to list.
Try this instead:
func errorChannelWatch(ch chan error, list []error) {
for err := range ch {
list = append(list, err)
}
}
2. For the second point, why you don't see anything in your error list:
The problem is this call:
errorChannel := make(chan error)
errorList := []error{}
go errorChannelWatch(errorChannel, errorList)
Here you pass errorList to errorChannelWatch by value, so the caller's slice header (pointer, length, and capacity) is not changed by the function. Only the underlying array may be changed, and only as long as the append calls don't need to allocate a new one; even then, the caller's length stays 0, so the appended elements are not visible through errorList.
To remedy the situation, either pass a pointer to the slice to errorChannelWatch, or rewrite it as a closure that captures errorList.
For the first proposed solution, change errorChannelWatch to
func errorChannelWatch(ch chan error, list *[]error) {
for err := range ch {
*list = append(*list, err)
}
}
and the call to
errorChannel := make(chan error)
errorList := []error{}
go errorChannelWatch(errorChannel, &errorList)
For the second proposed solution, just change the call to
errorChannel := make(chan error)
errorList := []error{}
go func() {
for err := range errorChannel {
errorList = append(errorList, err)
}
} ()
3. A minor remark:
One could think that there is a synchronisation problem here:
fileWaitGroup.Wait()
close(errorChannel)
log.Println(errorList)
How can you be sure that errorList isn't modified after the call to close? You can't know how many values the errorChannelWatch goroutine still has to process.
Because the channel is unbuffered and wg.Done() runs after each send, all error values have been received by the time fileWaitGroup.Wait() returns. The append for the last received value, however, happens after the receive, so it can still race with log.Println(errorList); adding a buffer to the error channel widens that window.
So I would advise having errorChannelWatch signal when it has finished (for example by closing a separate done channel), waiting for that signal before reading errorList, and explaining the synchronisation in a comment.
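One way to make the hand-off explicit, sketched here with illustrative names (collectErrors, errCh, done) rather than the code from the question: the collecting goroutine closes a done channel when its range loop ends, and the caller waits for that before reading the slice.

```go
package main

import (
	"fmt"
	"sync"
)

// collectErrors runs n workers that each report one error and returns the
// errors only after the collecting goroutine has finished.
func collectErrors(n int) []error {
	errCh := make(chan error)
	done := make(chan struct{})
	var errs []error

	go func() {
		defer close(done) // signals that no further appends will happen
		for err := range errCh {
			errs = append(errs, err)
		}
	}()

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errCh <- fmt.Errorf("worker %d failed", i)
		}(i)
	}

	wg.Wait()    // all sends have completed
	close(errCh) // ends the collector's range loop
	<-done       // wait for the collector before reading errs
	return errs
}

func main() {
	fmt.Println(len(collectErrors(10)))
}
```

With this arrangement the result no longer depends on whether errCh is buffered.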