Aggregating multiple results from go routines into a single array - go

I have the following function which spins off a given amount of go routines
func (r *Runner) Execute() {
var wg sync.WaitGroup
wg.Add(len(r.pipelines))
for _, p := range r.pipelines {
go executePipeline(p, &wg)
}
wg.Wait()
errs := ....//contains list of errors reported by any/all go routines
}
I was thinking there might be some way with channels, but I can't seem to figure it out.

One way to do this is using mutexes if you can make executePipeline retuen errors:
// ...
for _, p := range r.pipelines {
go func(p pipelineType) {
if err := executePipeline(p, &wg); err != nil {
mu.Lock()
errs = append(errs, err)
mu.UnLock()
}
}(p)
}
To use a channel, you can have a separate goroutine listning for errors:
errCh := make(chan error)
go func() {
for e := range errCh {
errs = append(errs, e)
}
}
and in the Execute function, make the following changes:
// ...
wg.Add(len(r.pipelines))
for _, p := range r.pipelines {
go func(p pipelineType) {
if err := executePipeline(p, &wg); err != nil {
errCh <- err
}
}(p)
}
wg.Wait()
close(errCh)
You can always use #zerkms method listed above if the number of goroutines is not high.
instead of returning error from executePipleline and using a anonymous function wrapper, you can always make above changes within the function itself.

You can use channels as #Kaveh Shahbazian suggested:
func (r *Runner) Execute() {
pipelineChan := makePipeline(r.pipelines)
for cnt := 0; cnt < len(r.pipelines); cnt++{
//recieve from channel
p := <- pipelineChan
//do something with the result
}
}
func makePipeline(pipelines []pipelineType) <-chan pipelineType{
pipelineChan := make(chan pipelineType)
go func(){
for _, p := range pipelines {
go func(p pipelineType){
pipelineChan <- executePipeline(p)
}(p)
}
}()
return pipelineChan
}
Please see this example: https://gist.github.com/steven-ferrer/9b2eeac3eed3f7667e8976f399d0b8ad

Related

How to catch errors from Goroutines?

I've created function which needs to run in goroutine, the code is working(this is just simple sample to illustrate the issue)
go rze(ftFilePath, 2)
func rze(ftDataPath,duration time.Duration) error {
}
I want to do something like this
errs := make(chan error, 1)
err := go rze(ftFilePath, 2)
if err != nil{
r <- Result{result, errs}
}
but not sure how to it, most of the examples show
how you do it when you using func
https://tour.golang.org/concurrency/5
You cannot use the return value of a function that is executed with the go keyword. Use an anonymous function instead:
errs := make(chan error, 1)
go func() {
errs <- rze(ftFilePath, 2)
}()
// later:
if err := <-errs; err != nil {
// handle error
}
You can use golang errgroup pkg.
var eg errgroup.Group
eg.Go(func() error {
return rze(ftFilePath, 2)
})
if err := g.Wait(); err != nil {
r <- Result{result, errs}
}
You can handle error from go routines using a separate channel for errors. Create a separate channel specifically for errors. Each child go routine must pass corresponding errors into this channel.
This is one of the ways how it can be done:
func main() {
type err error
items := []string{"1", "2", "3"}
ch := make(chan err, len(items))
for _, f := range items {
go func(f string) {
var e err
e = testFunc(f)
ch <- e
}(f)
}
for range items {
e := <-ch
fmt.Println("Error: ", e)
if e != nil {
// DO Something
}
}
}
func testFunc(item string) error {
if item == "1" {
return errors.New("err")
}
return nil
}

How to return the success or failure result in the goroutine to the main function and fmt.Println the number of successes

I want to write the contents of the process function with a function different from the main function.
I want to count the number of urls that did not result in an error with "resp, err: = client.Do (req)".
I want to write the number of successes before fmt.Println ("Finish!") In the main function.
What should I do?
func main() {
site_list := [][]string{
{"site1","https://www.aaaa"},
{"site2","https://www.bbbb"},
{"site3","https://www.cccc"},
{"site4","https://www.dddd"},
}
var wg sync.WaitGroup
maxChan := make(chan bool, 2)
for _, v := range site_list {
maxChan <- true
wg.Add(1)
go process(v[1], maxChan, &wg)
}
wg.Wait()
fmt.Println("Finish!")
}
func process(site_url string, maxChan chan bool, wg *sync.WaitGroup) {
defer wg.Done()
defer func(maxChan chan bool) {
<-maxChan
}(maxChan)
req, err := http.NewRequest("GET", site_url, nil)
client := new(http.Client)
resp, err := client.Do(req)
if err != nil {
fmt.Println("client.Doエラー" + site_url)
return
}
defer resp.Body.Close()
}
As per my comment above, create a custom work item struct:
type urlItem struct {
Name string
Url string
Err error
}
Use pointers, so that the process() function can update the underlying struct:
site_list := []*urlItem{
&urlItem{"site1", "https://www.aaaa", nil},
&urlItem{"site2", "https://www.bbbb", nil},
&urlItem{"site3", "https://www.cccc", nil},
&urlItem{"site4", "https://www.dddd", nil},
}
process() can then safely update it's particular work item (without the need for any mutexs or synchronization) as each go-routine gets a unique work item.
func process(site *urlItem, maxChan chan bool, wg *sync.WaitGroup) { /* ... */ }
Then tally the go-routine errors at the end:
var failCount int
for _, v := range site_list {
if v.Err != nil {
failCount++
}
}
Playground Example

Confusion regarding channel directions and blocking in Go

In a function definition, if a channel is an argument without a direction, does it have to send or receive something?
func makeRequest(url string, ch chan<- string, results chan<- string) {
start := time.Now()
resp, err := http.Get(url)
defer resp.Body.Close()
if err != nil {
fmt.Printf("%v", err)
}
resp, err = http.Post(url, "text/plain", bytes.NewBuffer([]byte("Hey")))
defer resp.Body.Close()
secs := time.Since(start).Seconds()
if err != nil {
fmt.Printf("%v", err)
}
// Cannot move past this.
ch <- fmt.Sprintf("%f", secs)
results <- <- ch
}
func MakeRequestHelper(url string, ch chan string, results chan string, iterations int) {
for i := 0; i < iterations; i++ {
makeRequest(url, ch, results)
}
for i := 0; i < iterations; i++ {
fmt.Println(<-ch)
}
}
func main() {
args := os.Args[1:]
threadString := args[0]
iterationString := args[1]
url := args[2]
threads, err := strconv.Atoi(threadString)
if err != nil {
fmt.Printf("%v", err)
}
iterations, err := strconv.Atoi(iterationString)
if err != nil {
fmt.Printf("%v", err)
}
channels := make([]chan string, 100)
for i := range channels {
channels[i] = make(chan string)
}
// results aggregate all the things received by channels in all goroutines
results := make(chan string, iterations*threads)
for i := 0; i < threads; i++ {
go MakeRequestHelper(url, channels[i], results, iterations)
}
resultSlice := make([]string, threads*iterations)
for i := 0; i < threads*iterations; i++ {
resultSlice[i] = <-results
}
}
In the above code,
ch <- or <-results
seems to be blocking every goroutine that executes makeRequest.
I am new to concurrency model of Go. I understand that sending to and receiving from a channel blocks but find it difficult what is blocking what in this code.
I'm not really sure that you are doing... It seems really convoluted. I suggest you read up on how to use channels.
https://tour.golang.org/concurrency/2
That being said you have so much going on in your code that it was much easier to just gut it to something a bit simpler. (It can be simplified further). I left comments to understand the code.
package main
import (
"fmt"
"io/ioutil"
"log"
"net/http"
"sync"
"time"
)
// using structs is a nice way to organize your code
type Worker struct {
wg sync.WaitGroup
semaphore chan struct{}
result chan Result
client http.Client
}
// group returns so that you don't have to send to many channels
type Result struct {
duration float64
results string
}
// closing your channels will stop the for loop in main
func (w *Worker) Close() {
close(w.semaphore)
close(w.result)
}
func (w *Worker) MakeRequest(url string) {
// a semaphore is a simple way to rate limit the amount of goroutines running at any single point of time
// google them, Go uses them often
w.semaphore <- struct{}{}
defer func() {
w.wg.Done()
<-w.semaphore
}()
start := time.Now()
resp, err := w.client.Get(url)
if err != nil {
log.Println("error", err)
return
}
defer resp.Body.Close()
// don't have any examples where I need to also POST anything but the point should be made
// resp, err = http.Post(url, "text/plain", bytes.NewBuffer([]byte("Hey")))
// if err != nil {
// log.Println("error", err)
// return
// }
// defer resp.Body.Close()
secs := time.Since(start).Seconds()
b, err := ioutil.ReadAll(resp.Body)
if err != nil {
log.Println("error", err)
return
}
w.result <- Result{duration: secs, results: string(b)}
}
func main() {
urls := []string{"https://facebook.com/", "https://twitter.com/", "https://google.com/", "https://youtube.com/", "https://linkedin.com/", "https://wordpress.org/",
"https://instagram.com/", "https://pinterest.com/", "https://wikipedia.org/", "https://wordpress.com/", "https://blogspot.com/", "https://apple.com/",
}
workerNumber := 5
worker := Worker{
semaphore: make(chan struct{}, workerNumber),
result: make(chan Result),
client: http.Client{Timeout: 5 * time.Second},
}
// use sync groups to allow your code to wait for
// all your goroutines to finish
for _, url := range urls {
worker.wg.Add(1)
go worker.MakeRequest(url)
}
// by declaring wait and close as a seperate goroutine
// I can get to the for loop below and iterate on the results
// in a non blocking fashion
go func() {
worker.wg.Wait()
worker.Close()
}()
// do something with the results channel
for res := range worker.result {
fmt.Printf("Request took %2.f seconds.\nResults: %s\n\n", res.duration, res.results)
}
}
The channels in channels are nil (no make is executed; you make the slice but not the channels), so any send or receive will block. I'm not sure exactly what you're trying to do here, but that's the basic problem.
See https://golang.org/doc/effective_go.html#channels for an explanation of how channels work.

which is one better code for goroutine overhead

I want to make Reader.Read concurrent with channel communication.
so I made two ways to run it
1:
type ReturnRead struct {
n int
err error
}
type ReadGoSt struct {
Returnc <-chan ReturnRead
Nextc chan struct{}
}
func (st *ReadGoSt) Close() {
defer func() {
recover()
}()
close(st.Next)
}
func ReadGo(r io.Reader, b []byte) *ReadGoSt {
returnc := make(chan ReturnRead)
nextc := make(chan bool)
go func() {
for range nextc {
n, err := r.Read(b)
returnc <- ReturnRead{n, err}
if err != nil {
return
}
}
}()
return &ReadGoSt{returnc, nextc}
}
2:
func ReadGo(r io.Reader, b []byte) <-chan ReturnRead {
returnc := make(chan ReturnRead)
go func() {
n, err := r.Read(b)
returnc <- ReturnRead{n, err}
}()
return returnc
}
i think code 2 creates too many overhead
which is the better code? 1? 2?
Code 1 is better and probably faster. Code 2 will just read once. But I think both solutions are not the best. you should loop over the read and send back only the bytes readed.
Something like : http://play.golang.org/p/zRPXOtdgWD

Goroutines not exiting when data channel is closed

I'm trying to follow along the bounded goroutine example that is posted at http://blog.golang.org/pipelines/bounded.go. The problem that I'm having is that if there are more workers spun up then the amount of work to do, the extra workers never get cancelled. Everything else seems to work, the values get computed and logged, but when I close the groups channel, the workers just hang at the range statement.
I guess what I don't understand (in both my code and the example code) is how do the workers know when there is no more work to do and that they should exit?
Update
A working (i.e. non-working) example is posted at http://play.golang.org/p/T7zBCYLECp. It shows the deadlock on the workers since they are all asleep and there is no work to do. What I'm confused about is that I think the example code would have the same problem.
Here is the code that I'm currently using:
// Creates a pool of workers to do a bunch of computations
func computeAll() error {
done := make(chan struct{})
defer close(done)
groups, errc := findGroups(done)
// start a fixed number of goroutines to schedule with
const numComputers = 20
c := make(chan result)
var wg sync.WaitGroup
wg.Add(numComputers)
for i := 0; i < numComputers; i++ {
go func() {
compute(done, groups, c)
wg.Done()
}()
}
go func() {
wg.Wait()
close(c)
}()
// log the results of the computation
for r := range c { // log the results }
if err := <-errc; err != nil {
return err
}
return nil
}
Here is the code that fills up the channel with data:
// Retrieves the groups of data the must be computed
func findGroups(done <-chan struct{}) (<-chan model, <-chan error) {
groups := make(chan model)
errc := make(chan error, 1)
go func() {
// close the groups channel after find returns
defer close(groups)
group, err := //... code to get the group ...
if err == nil {
// add the group to the channel
select {
case groups <- group:
}
}
}()
return groups, errc
}
And here is the code that reads the channel to do the computations.
// Computes the results for the groups of data
func compute(done <-chan struct{}, groups <-chan model, c chan<- result) {
for group := range groups {
value := compute(group)
select {
case c <- result{value}:
case <-done:
return
}
}
}
Because you're trying to read from errc and it's empty unless there's an error.
//edit
computeAll() will always block on <- errc if there are no errors, another approach is to use something like:
func computeAll() (err error) {
.........
select {
case err = <-errc:
default: //don't block
}
return
}
Try to close the errc as OneOfOne says
go func() {
wg.Wait()
close(c)
close(errc)
}()
// log the results of the computation
for r := range c { // log the results }
if err := range errc {
if err != nil {
return err
}
}

Resources