I'm trying to write to a channel as last action in a goroutine function.
Unfortunately this is not working. And the waitGroup is never done.
import (
"sync"
"github.com/SlyMarbo/rss"
"fmt"
)
func main() {
urls := []string{"http://rss.cnn.com/rss/edition.rss", "http://rss.time.com/web/time/rss/top/index.xml"}
var c = make(chan string)
var wg sync.WaitGroup
for _, url := range urls {
wg.Add(1)
go receiveRss(url, &wg, c)
}
wg.Wait()
fmt.Println("==============DONE=================")
}
func receiveRss(url string, wg *sync.WaitGroup, c chan string) {
defer wg.Done()
feed, err := rss.Fetch(url)
if err != nil {
fmt.Println("Failed to retrieve RSS feed", err)
}
items := feed.Items
for _, item := range items {
c <- item.Title
}
}
When replacing c <- item.Title with fmt.Println(item.Title) the deferred function is called and DONE is printed.
The problem is that I was only writing to the channel. Never reading from it.
Without doing this the channel would be useless.
The solution to this is:
Reading from the channel after the loop which starts the goroutines:
for title := range c {
fmt.Println(title)
}
This then causes an endless loop if the channel is never closed.
So I just close the channel after writing to it:
close(c)
Here is the whole code:
func main() {
urls := []string{"http://rss.cnn.com/rss/edition.rss", "http://rss.time.com/web/time/rss/top/index.xml"}
var c = make(chan []string)
var wg sync.WaitGroup
for _, url := range urls {
wg.Add(1)
go receiveRss(url, &wg, c)
}
for title := range c {
fmt.Println(title)
}
wg.Wait()
fmt.Println("==============DONE=================")
}
func receiveRss(url string, wg *sync.WaitGroup, c chan []string) {
defer wg.Done()
feed, err := rss.Fetch(url)
if err != nil {
fmt.Println("Failed to retrieve RSS feed", err)
}
items := feed.Items
var titles []string
for _, item := range items {
titles = append(titles, item.Title)
}
c <- titles
close(c)
}
Related
I am having issue while using waitgroup with the buffered channel. The problem is waitgroup closes before channel is read completely, which make my channel is half read and break in between.
func main() {
var wg sync.WaitGroup
var err error
start := time.Now()
students := make([]studentDetails, 0)
studentCh := make(chan studentDetail, 10000)
errorCh := make(chan error, 1)
wg.Add(1)
go s.getDetailStudents(rCtx, studentCh , errorCh, &wg, s.Link, false)
go func(ch chan studentDetail, e chan error) {
LOOP:
for {
select {
case p, ok := <-ch:
if ok {
L.Printf("Links %s: [%s]\n", p.title, p.link)
students = append(students, p)
} else {
L.Print("Closed channel")
break LOOP
}
case err = <-e:
if err != nil {
break
}
}
}
}(studentCh, errorCh)
wg.Wait()
close(studentCh)
close(errorCh)
L.Warnln("closed: all wait-groups completed!")
L.Warnf("total items fetched: %d", len(students))
elapsed := time.Since(start)
L.Warnf("operation took %s", elapsed)
}
The problem is this function is recursive. I mean some http call to fetch students and then make more calls depending on condition.
func (s Student) getDetailStudents(rCtx context.Context, content chan<- studentDetail, errorCh chan<- error, wg *sync.WaitGroup, url string, subSection bool) {
util.MustNotNil(rCtx)
L := logger.GetLogger(rCtx)
defer func() {
L.Println("Closing all waitgroup!")
wg.Done()
}()
wc := getWC()
httpClient := wc.Registry.MustHTTPClient()
res, err := httpClient.Get(url)
if err != nil {
L.Fatal(err)
}
defer res.Body.Close()
if res.StatusCode != 200 {
L.Errorf("status code error: %d %s", res.StatusCode, res.Status)
errorCh <- errors.New("service_status_code")
return
}
// parse response and return error if found some through errorCh as done above.
// decide page subSection based on response if it is more.
if !subSection {
wg.Add(1)
go s.getDetailStudents(rCtx, content, errorCh, wg, link, true)
// L.Warnf("total pages found %d", pageSub.Length()+1)
}
// Find students from response list and parse each Student
students := s.parseStudentItemList(rCtx, item)
for _, student := range students {
content <- student
}
L.Warnf("Calling HTTP Service for %q with total %d record", url, elementsSub.Length())
}
Variables are changed to avoid original code base.
The problem is students are read randomly as soon as Waitgroup complete. I am expecting to hold the execution until all students are read, In case of error it should break as soon error encounter.
You need to know when the receiving goroutine completes. The WaitGroup does that for the generating goroutine. So, you can use two waitgroups:
wg.Add(1)
go s.getDetailStudents(rCtx, studentCh , errorCh, &wg, s.Link, false)
wgReader.Add(1)
go func(ch chan studentDetail, e chan error) {
defer wgReader.Done()
...
}
wg.Wait()
close(studentCh)
close(errorCh)
wgReader.Wait() // Wait for the readers to complete
Since you are using buffered channels you can retrieve the remaining values after closing the channel. You will also need a mechanism to prevent your main function from exiting too early while the reader is still doing work ,as #Burak Serdar has advised.
I restructured the code to give a working example but it should get the point across.
package main
import (
"context"
"log"
"sync"
"time"
)
type studentDetails struct {
title string
link string
}
func main() {
var wg sync.WaitGroup
var err error
students := make([]studentDetails, 0)
studentCh := make(chan studentDetails, 10000)
errorCh := make(chan error, 1)
start := time.Now()
wg.Add(1)
go getDetailStudents(context.TODO(), studentCh, errorCh, &wg, "http://example.com", false)
LOOP:
for {
select {
case p, ok := <-studentCh:
if ok {
log.Printf("Links %s: [%s]\n", p.title, p.link)
students = append(students, p)
} else {
log.Println("Draining student channel")
for p := range studentCh {
log.Printf("Links %s: [%s]\n", p.title, p.link)
students = append(students, p)
}
break LOOP
}
case err = <-errorCh:
if err != nil {
break LOOP
}
case <-wrapWait(&wg):
close(studentCh)
}
}
close(errorCh)
elapsed := time.Since(start)
log.Printf("operation took %s", elapsed)
}
func getDetailStudents(rCtx context.Context, content chan<- studentDetails, errorCh chan<- error, wg *sync.WaitGroup, url string, subSection bool) {
defer func() {
log.Println("Closing")
wg.Done()
}()
if !subSection {
wg.Add(1)
go getDetailStudents(rCtx, content, errorCh, wg, url, true)
// L.Warnf("total pages found %d", pageSub.Length()+1)
}
content <- studentDetails{
title: "title",
link: "link",
}
}
// helper function to allow using WaitGroup in a select
func wrapWait(wg *sync.WaitGroup) <-chan struct{} {
out := make(chan struct{})
go func() {
wg.Wait()
out <- struct{}{}
}()
return out
}
wg.Add(1)
go func(){
defer wg.Done()
// I do not think that you need a recursive function.
// this function overcomplicated.
s.getDetailStudents(rCtx, studentCh , errorCh, &wg, s.Link, false)
}(...)
wg.Add(1)
go func(ch chan studentDetail, e chan error) {
defer wg.Done()
...
}(...)
wg.Wait()
close(studentCh)
close(errorCh)
This should solve the problem. s.getDetailStudents function must be simplified. Making it recursive does not have any benefit.
I want to write the contents of the process function with a function different from the main function.
I want to count the number of urls that did not result in an error with "resp, err: = client.Do (req)".
I want to write the number of successes before fmt.Println ("Finish!") In the main function.
What should I do?
func main() {
site_list := [][]string{
{"site1","https://www.aaaa"},
{"site2","https://www.bbbb"},
{"site3","https://www.cccc"},
{"site4","https://www.dddd"},
}
var wg sync.WaitGroup
maxChan := make(chan bool, 2)
for _, v := range site_list {
maxChan <- true
wg.Add(1)
go process(v[1], maxChan, &wg)
}
wg.Wait()
fmt.Println("Finish!")
}
func process(site_url string, maxChan chan bool, wg *sync.WaitGroup) {
defer wg.Done()
defer func(maxChan chan bool) {
<-maxChan
}(maxChan)
req, err := http.NewRequest("GET", site_url, nil)
client := new(http.Client)
resp, err := client.Do(req)
if err != nil {
fmt.Println("client.Doエラー" + site_url)
return
}
defer resp.Body.Close()
}
As per my comment above, create a custom work item struct:
type urlItem struct {
Name string
Url string
Err error
}
Use pointers, so that the process() function can update the underlying struct:
site_list := []*urlItem{
&urlItem{"site1", "https://www.aaaa", nil},
&urlItem{"site2", "https://www.bbbb", nil},
&urlItem{"site3", "https://www.cccc", nil},
&urlItem{"site4", "https://www.dddd", nil},
}
process() can then safely update it's particular work item (without the need for any mutexs or synchronization) as each go-routine gets a unique work item.
func process(site *urlItem, maxChan chan bool, wg *sync.WaitGroup) { /* ... */ }
Then tally the go-routine errors at the end:
var failCount int
for _, v := range site_list {
if v.Err != nil {
failCount++
}
}
Playground Example
I am playing around with Golang and I created this little app to make several concurrent api calls using goroutines.
While the app works, after the calls complete, the app gets stuck, which makes sense because it cannot exit the range c loop because the channel is not closed.
I am not sure where to better close the channel in this pattern.
package main
import "fmt"
import "net/http"
func main() {
links := []string{
"https://github.com/fabpot",
"https://github.com/andrew",
"https://github.com/taylorotwell",
"https://github.com/egoist",
"https://github.com/HugoGiraudel",
}
checkUrls(links)
}
func checkUrls(urls []string) {
c := make(chan string)
for _, link := range urls {
go checkUrl(link, c)
}
for msg := range c {
fmt.Println(msg)
}
close(c) //this won't get hit
}
func checkUrl(url string, c chan string) {
_, err := http.Get(url)
if err != nil {
c <- "We could not reach:" + url
} else {
c <- "Success reaching the website:" + url
}
}
You close a channel when there are no more values to send, so in this case it's when all checkUrl goroutines have completed.
var wg sync.WaitGroup
func checkUrls(urls []string) {
c := make(chan string)
for _, link := range urls {
wg.Add(1)
go checkUrl(link, c)
}
go func() {
wg.Wait()
close(c)
}()
for msg := range c {
fmt.Println(msg)
}
}
func checkUrl(url string, c chan string) {
defer wg.Done()
_, err := http.Get(url)
if err != nil {
c <- "We could not reach:" + url
} else {
c <- "Success reaching the website:" + url
}
}
(Note that the error from http.Get is only going to reflect connection and protocol errors. It is not going to contain http server errors if you're expecting those too, which you must be seeing how you're checking for paths and not just hosts.)
When writing programs in Go using channels and goroutines always think about who (which function) owns a channel. I prefer the practice of letting the function who owns a channel close it. If i were to write this i would do as shown below.
Note: A better way to handle situations like this is the Fan-out, fan-in concurrency pattern. refer(https://blog.golang.org/pipelines)Go Concurrency Patterns
package main
import "fmt"
import "net/http"
import "sync"
func main() {
links := []string{
"https://github.com/fabpot",
"https://github.com/andrew",
"https://github.com/taylorotwell",
"https://github.com/egoist",
"https://github.com/HugoGiraudel",
}
processURLS(links)
fmt.Println("End of Main")
}
func processURLS(links []string) {
resultsChan := checkUrls(links)
for msg := range resultsChan {
fmt.Println(msg)
}
}
func checkUrls(urls []string) chan string {
outChan := make(chan string)
go func(urls []string) {
defer close(outChan)
var wg sync.WaitGroup
for _, url := range urls {
wg.Add(1)
go checkUrl(&wg, url, outChan)
}
wg.Wait()
}(urls)
return outChan
}
func checkUrl(wg *sync.WaitGroup, url string, c chan string) {
defer wg.Done()
_, err := http.Get(url)
if err != nil {
c <- "We could not reach:" + url
} else {
c <- "Success reaching the website:" + url
}
}
I need to fetch responses from multiple go routines and put them into an array. I know that channels could be used for this, however I am not sure how I can make sure that all go routines have finished processing the results. Thus I am using a waitgroup.
Code
func main() {
log.Info("Collecting ints")
var results []int32
for _, broker := range e.BrokersByBrokerID {
wg.Add(1)
go getInt32(&wg)
}
wg.Wait()
log.info("Collected")
}
func getInt32(wg *sync.WaitGroup) (int32, error) {
defer wg.Done()
// Just to show that this method may just return an error and no int32
err := broker.Open(config)
if err != nil && err != sarama.ErrAlreadyConnected {
return 0, fmt.Errorf("Cannot connect to broker '%v': %s", broker.ID(), err)
}
defer broker.Close()
return 1003, nil
}
My question
How can I put all the response int32 (which may return an error) into my int32 array, making sure that all go routines have finished their processing work and returned either the error or the int?
If you don't process the return values of the function launched as a goroutine, they are discarded. See What happens to return value from goroutine.
You may use a slice to collect the results, where each goroutine could receive the index to put the results to, or alternatively the address of the element. See Can I concurrently write different slice elements. Note that if you use this, the slice must be pre-allocated and only the element belonging to the goroutine may be written, you can't "touch" other elements and you can't append to the slice.
Or you may use a channel, on which the goroutines send values that include the index or ID of the item they processed, so the collecting goroutine can identify or order them. See How to collect values from N goroutines executed in a specific order?
If processing should stop on the first error encountered, see Close multiple goroutine if an error occurs in one in go
Here's an example how it could look like when using a channel. Note that no waitgroup is needed here, because we know that we expect as many values on the channel as many goroutines we launch.
type result struct {
task int32
data int32
err error
}
func main() {
tasks := []int32{1, 2, 3, 4}
ch := make(chan result)
for _, task := range tasks {
go calcTask(task, ch)
}
// Collect results:
results := make([]result, len(tasks))
for i := range results {
results[i] = <-ch
}
fmt.Printf("Results: %+v\n", results)
}
func calcTask(task int32, ch chan<- result) {
if task > 2 {
// Simulate failure
ch <- result{task: task, err: fmt.Errorf("task %v failed", task)}
return
}
// Simulate success
ch <- result{task: task, data: task * 2, err: nil}
}
Output (try ot on the Go Playground):
Results: [{task:4 data:0 err:0x40e130} {task:1 data:2 err:<nil>} {task:2 data:4 err:<nil>} {task:3 data:0 err:0x40e138}]
I also believe you have to use channel, it must be something like this:
package main
import (
"fmt"
"log"
"sync"
)
var (
BrokersByBrokerID = []int32{1, 2, 3}
)
type result struct {
data string
err string // you must use error type here
}
func main() {
var wg sync.WaitGroup
var results []result
ch := make(chan result)
for _, broker := range BrokersByBrokerID {
wg.Add(1)
go getInt32(ch, &wg, broker)
}
go func() {
for v := range ch {
results = append(results, v)
}
}()
wg.Wait()
close(ch)
log.Printf("collected %v", results)
}
func getInt32(ch chan result, wg *sync.WaitGroup, broker int32) {
defer wg.Done()
if broker == 1 {
ch <- result{err: fmt.Sprintf("error: gor broker 1")}
return
}
ch <- result{data: fmt.Sprintf("broker %d - ok", broker)}
}
Result will look like this:
2019/02/05 15:26:28 collected [{broker 3 - ok } {broker 2 - ok } { error: gor broker 1}]
package main
import (
"fmt"
"log"
"sync"
)
var (
BrokersByBrokerID = []int{1, 2, 3, 4}
)
type result struct {
data string
err string // you must use error type here
}
func main() {
var wg sync.WaitGroup
var results []int
ch := make(chan int)
done := make(chan bool)
for _, broker := range BrokersByBrokerID {
wg.Add(1)
go func(i int) {
defer wg.Done()
ch <- i
if i == 4 {
done <- true
}
}(broker)
}
L:
for {
select {
case v := <-ch:
results = append(results, v)
if len(results) == 4 {
//<-done
close(ch)
break L
}
case _ = <-done:
break
}
}
fmt.Println("STOPPED")
//<-done
wg.Wait()
log.Printf("collected %v", results)
}
Thank cn007b and Edenshaw. My answer is based on their answers.
As Edenshaw commented, need another sync.Waitgroup for goroutine which getting results from channel, or you may get an incomplete array.
package main
import (
"fmt"
"sync"
"encoding/json"
)
type Resp struct {
id int
}
func main() {
var wg sync.WaitGroup
chanRes := make(chan interface{}, 3)
for i := 0; i < 3; i++ {
wg.Add(1)
resp := &Resp{}
go func(i int, resp *Resp) {
defer wg.Done()
resp.id = i
chanRes <- resp
}(i, resp)
}
res := make([]interface{}, 0)
var wg2 sync.WaitGroup
wg2.Add(1)
go func() {
defer wg2.Done()
for v := range chanRes {
res = append(res, v.(*Resp).id)
}
}()
wg.Wait()
close(chanRes)
wg2.Wait()
resStr, _ := json.Marshal(res)
fmt.Println(string(resStr))
}
package main
import (
"fmt"
"log"
"sync"
"time"
)
var (
BrokersByBrokerID = []int{1, 2, 3, 4}
)
type result struct {
data string
err string // you must use error type here
}
func main() {
var wg sync.WaitGroup.
var results []int
ch := make(chan int)
done := make(chan bool)
for _, broker := range BrokersByBrokerID {
wg.Add(1)
go func(i int) {
defer wg.Done()
ch <- i
if i == 4 {
done <- true
}
}(broker)
}
for v := range ch {
results = append(results, v)
if len(results) == 4 {
close(ch)
}
}
fmt.Println("STOPPED")
<-done
wg.Wait()
log.Printf("collected %v", results)
}
</pre>
I have the following golang program;
package main
import (
"fmt"
"net/http"
"time"
)
var urls = []string{
"http://www.google.com/",
"http://golang.org/",
"http://yahoo.com/",
}
type HttpResponse struct {
url string
response *http.Response
err error
status string
}
func asyncHttpGets(url string, ch chan *HttpResponse) {
client := http.Client{}
if url == "http://www.google.com/" {
time.Sleep(500 * time.Millisecond) //google is down
}
fmt.Printf("Fetching %s \n", url)
resp, err := client.Get(url)
u := &HttpResponse{url, resp, err, "fetched"}
ch <- u
fmt.Println("sent to chan")
}
func main() {
fmt.Println("start")
ch := make(chan *HttpResponse, len(urls))
for _, url := range urls {
go asyncHttpGets(url, ch)
}
for i := range ch {
fmt.Println(i)
}
fmt.Println("Im done")
}
Run it on Playground
However when I run it; it hangs (ie the last part that ought to print Im done doesnt run.)
Here's the terminal output;;
$ go run get.go
start
Fetching http://yahoo.com/
Fetching http://golang.org/
Fetching http://www.google.com/
sent to chan
&{http://www.google.com/ 0xc820144120 fetched}
sent to chan
&{http://golang.org/ 0xc82008b710 fetched}
sent to chan
&{http://yahoo.com/ 0xc82008b7a0 fetched}
The problem is that ranging over a channel in a for loop will continue forever unless the channel is closed. If you want to read precisely len(urls) values from the channel, you should loop that many times:
for i := 0; i < len(urls); i++ {
fmt.Println(<-ch)
}
Another good dirty devious trick would be to use sync.WaitGroup and increment it per goroutine and then monitor it with a Wait and after its done it will close your channel allowing the next blocks of code to run, the reason I am offering you this approach is because it gets away from using a static number in a loop like len(urls) so that you can have a dynamic slice that might change and what not.
The reason Wait and close are in their own goroutine is so that your code can reach the for loop to range over your channel
package main
import (
"fmt"
"net/http"
"time"
"sync"
)
var urls = []string{
"http://www.google.com/",
"http://golang.org/",
"http://yahoo.com/",
}
type HttpResponse struct {
url string
response *http.Response
err error
status string
}
func asyncHttpGets(url string, ch chan *HttpResponse, wg *sync.WaitGroup) {
client := http.Client{}
if url == "http://www.google.com/" {
time.Sleep(500 * time.Millisecond) //google is down
}
fmt.Printf("Fetching %s \n", url)
resp, err := client.Get(url)
u := &HttpResponse{url, resp, err, "fetched"}
ch <- u
fmt.Println("sent to chan")
wg.Done()
}
func main() {
fmt.Println("start")
ch := make(chan *HttpResponse, len(urls))
var wg sync.WaitGroup
for _, url := range urls {
wg.Add(1)
go asyncHttpGets(url, ch, &wg)
}
go func() {
wg.Wait()
close(ch)
}()
for i := range ch {
fmt.Println(i)
}
fmt.Println("Im done")
}