I have the following code. Pay special attention to the anonymous function:
func saveMatterNodes(matterId int, nodes []doculaw.LitigationNode) (bool, error) {
var (
err error
resp *http.Response
)
// Do this in multiple threads
for _, node := range nodes {
fmt.Println("in loops")
go func() {
postValues := doculaw.LitigationNode{
Name: node.Name,
Description: node.Description,
Days: node.Days,
Date: node.Date,
IsFinalStep: false,
Completed: false,
Matter: matterId}
b := new(bytes.Buffer)
json.NewEncoder(b).Encode(postValues)
resp, err = http.Post("http://127.0.0.1:8001/matterNode/", "application/json", b)
io.Copy(os.Stdout, resp.Body)
fmt.Println("Respone from http post", resp)
if err != nil {
fmt.Println(err)
}
}()
}
if err != nil {
return false, err
} else {
return true, nil
}
}
If I remove the go func() {}() part and just leave the code in between, it seems to execute fine, but the moment I add it back it does not execute. Any idea why that is? I initially thought it might be because it's executing on a different thread, but this doesn't seem to be the case: I can see from my webservice access logs that it is not executing at all.
I think this behaviour occurs because the function never waits for the goroutines: after you launch them, there is no construct in the program to wait for them to finish their work, so the function (and possibly the whole program) can return before they ever run.
Channels, IO operations, sync.WaitGroup, etc. can be used to block until the goroutines have finished.
You may want to try sync.WaitGroup.
Example: https://play.golang.org/p/Zwn0YBynl2
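For instance, here is a minimal sketch of the loop from the question using sync.WaitGroup (it assumes the same doculaw types and endpoint as in the question, collects errors on a buffered channel, and passes node into the closure so each goroutine works on its own copy):

func saveMatterNodes(matterId int, nodes []doculaw.LitigationNode) (bool, error) {
    var wg sync.WaitGroup
    errs := make(chan error, len(nodes)) // buffered, so no goroutine blocks on send

    for _, node := range nodes {
        wg.Add(1)
        go func(node doculaw.LitigationNode) { // pass node in so each goroutine gets its own copy
            defer wg.Done()
            postValues := doculaw.LitigationNode{
                Name:        node.Name,
                Description: node.Description,
                Days:        node.Days,
                Date:        node.Date,
                IsFinalStep: false,
                Completed:   false,
                Matter:      matterId,
            }
            b := new(bytes.Buffer)
            if err := json.NewEncoder(b).Encode(postValues); err != nil {
                errs <- err
                return
            }
            resp, err := http.Post("http://127.0.0.1:8001/matterNode/", "application/json", b)
            if err != nil {
                errs <- err
                return
            }
            defer resp.Body.Close()
            io.Copy(os.Stdout, resp.Body)
        }(node)
    }

    wg.Wait()   // block here until every goroutine has called Done
    close(errs) // safe to close: all senders are finished

    for err := range errs {
        return false, err // report the first error that was collected
    }
    return true, nil
}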
I wish to implement parallel API calling in Go using goroutines. My requirements are:
1. Once the requests are fired, I need to wait for all responses (which take different amounts of time).
2. If any of the requests fails and returns an error, I wish to end (or preempt) the remaining goroutines.
3. I also want a timeout value associated with each goroutine (or API call).
I have implemented the below for 1 and 2, but need help with how to implement 3. Feedback on 1 and 2 would also help.
package main
import (
"errors"
"fmt"
"sync"
"time"
)
func main() {
var wg sync.WaitGroup
c := make(chan interface{}, 1)
c2 := make(chan interface{}, 1)
err := make(chan interface{})
wg.Add(1)
go func() {
defer wg.Done()
result, e := doSomeWork()
if e != nil {
err <- e
return
}
c <- result
}()
wg.Add(1)
go func() {
defer wg.Done()
result2, e := doSomeWork2()
if e != nil {
err <- e
return
}
c2 <- result2
}()
go func() {
wg.Wait()
close(c)
close(c2)
close(err)
}()
for e := range err {
// an error happened here; you could exit your caller function
fmt.Println("Error==>", e)
return
}
fmt.Println(<-c, <-c2)
}
// mimic api call 1
func doSomeWork() (function1, error) {
time.Sleep(10 * time.Second)
obj := function1{"ABC", "29"}
return obj, nil
}
type function1 struct {
Name string
Age string
}
// mimic api call 2
func doSomeWork2() (function2, error) {
time.Sleep(4 * time.Second)
r := errors.New("Error Occured")
if 1 == 2 {
fmt.Println(r)
}
obj := function2{"Delhi", "Delhi"}
// return error as nil for now
return obj, nil
}
type function2 struct {
City string
State string
}
Thanks in advance.
This kind of fork-and-join pattern is exactly what golang.org/x/sync/errgroup was designed for. (Identifying the appropriate “first error” from a group of goroutines can be surprisingly subtle.)
You can use errgroup.WithContext to obtain a context.Context that is cancelled as soon as any of the goroutines in the group returns an error. The (*Group).Wait method waits for the goroutines to complete and returns the first non-nil error.
For your example, that might look something like: https://play.golang.org/p/jqYeb4chHCZ.
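In outline (a sketch assuming doSomeWork and doSomeWork2 are changed to accept a context.Context, and using the golang.org/x/sync/errgroup package), it looks something like:

func main() {
    g, ctx := errgroup.WithContext(context.Background())

    var r1 function1
    var r2 function2

    g.Go(func() error {
        var e error
        r1, e = doSomeWork(ctx) // assumes a ctx-aware doSomeWork
        return e                // a non-nil error cancels ctx for the whole group
    })
    g.Go(func() error {
        var e error
        r2, e = doSomeWork2(ctx)
        return e
    })

    if err := g.Wait(); err != nil { // Wait blocks until both calls return
        fmt.Println("Error==>", err)
        return
    }
    fmt.Println(r1, r2)
}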
You can then inject a timeout within any given call by wrapping the Context using context.WithTimeout.
(However, in my experience if you've plumbed in cancellation correctly, explicit timeouts are almost never helpful — the end user can cancel explicitly if they get tired of waiting, and you probably don't want to promote degraded service to a complete outage if something starts to take just a bit longer than you expected.)
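If you do decide you want a per-call timeout despite that caveat, the wrapping described above might look like this for one of the calls in the sketch (the 5-second value is arbitrary):

g.Go(func() error {
    callCtx, cancel := context.WithTimeout(ctx, 5*time.Second) // derived from the group's context
    defer cancel()
    var e error
    r1, e = doSomeWork(callCtx)
    return e
})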
To support timeouts and cancelation of goroutine work, the standard mechanism is to use context.Context.
ctx := context.Background() // root context
// wrap the context with a timeout and/or cancelation mechanism
ctx, cancel := context.WithTimeout(ctx, 5*time.Second) // with timeout or cancel
//ctx, cancel := context.WithCancel(ctx) // no timeout just cancel
defer cancel() // avoid memory leak if we never cancel/timeout
Next, your worker goroutines need to accept and monitor the state of the ctx. To do this in parallel with the time.Sleep (which mimics a long computation), convert the sleep to a channel-based solution:
// mimic api call 1
func doSomeWork(ctx context.Context) (function1, error) {
//time.Sleep(10 * time.Second)
select {
case <-time.After(10 * time.Second):
// wait completed
case <-ctx.Done():
return function1{}, ctx.Err()
}
// ...
}
And if one worker goroutine fails, signal to the other worker that the request should be aborted by simply calling the cancel() function:
result, e := doSomeWork(ctx)
if e != nil {
cancel() // <- add this
err <- e
return
}
Pulling this all together:
https://play.golang.org/p/1Kpe_tre7XI
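Roughly, the pieces combine like the sketch below (it reuses the question's channels and WaitGroup, renames the error channel to errCh, and picks an arbitrary 5-second overall timeout; the error channel is buffered so a failing worker never blocks):

func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    var wg sync.WaitGroup
    c := make(chan interface{}, 1)
    c2 := make(chan interface{}, 1)
    errCh := make(chan error, 2)

    wg.Add(1)
    go func() {
        defer wg.Done()
        result, e := doSomeWork(ctx) // the ctx-aware version from above
        if e != nil {
            cancel() // tell the other worker to stop
            errCh <- e
            return
        }
        c <- result
    }()

    wg.Add(1)
    go func() {
        defer wg.Done()
        result2, e := doSomeWork2(ctx)
        if e != nil {
            cancel()
            errCh <- e
            return
        }
        c2 <- result2
    }()

    go func() {
        wg.Wait()
        close(c)
        close(c2)
        close(errCh)
    }()

    for e := range errCh {
        fmt.Println("Error==>", e)
        return // the first error ends the caller; the deferred cancel stops the rest
    }
    fmt.Println(<-c, <-c2)
}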
EDIT: the sleep example above is obviously a contrived way to abort a "fake" task. In the real world, HTTP or SQL DB calls would be involved, and the standard library has added context support to these potentially blocking calls (database/sql in Go 1.8, net/http request contexts in Go 1.7; the http.NewRequestWithContext helper shown below arrived in Go 1.13):
func doSomeWork(ctx context.Context) error {
// DB
db, err := sql.Open("mysql", "...") // check err
//rows, err := db.Query("SELECT age from users", age)
rows, err := db.QueryContext(ctx, "SELECT age from users", age)
if err != nil {
return err // will return with error if context is canceled
}
// http
// req, err := http.NewRequest("GET", "http://example.com", nil)
req, err := http.NewRequestWithContext(ctx, "GET", "http://example.com", nil) // check err
resp, err := http.DefaultClient.Do(req)
if err != nil {
return err // will return with error if context is canceled
}
}
EDIT (2): to poll a context's state without blocking, leverage select's default branch:
select {
case <-ctx.Done():
return ctx.Err()
default:
// if ctx is not done - this branch is used
}
The default branch can optionally have code in it, but even if it is empty, its presence will prevent blocking; the select then just polls the status of the context at that instant in time.
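For instance, a long-running worker might poll between chunks of work (doChunk and numChunks are placeholders for real work):

for i := 0; i < numChunks; i++ {
    select {
    case <-ctx.Done():
        return ctx.Err() // abandon the remaining work
    default:
        // ctx is not done yet; fall through and keep working
    }
    doChunk(i) // one unit of real work
}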
Let's say I have a function IsAPrimaryColour() which works by calling three other functions IsRed(), IsGreen() and IsBlue(). Since the three functions are quite independent of one another, they can run concurrently. The return conditions are:
1. If any of the three functions returns true, IsAPrimaryColour() should also return true. There is no need to wait for the other functions to finish. That is: IsAPrimaryColour() is true if IsRed() is true OR IsGreen() is true OR IsBlue() is true.
2. If all functions return false, IsAPrimaryColour() should also return false. That is: IsAPrimaryColour() is false if IsRed() is false AND IsGreen() is false AND IsBlue() is false.
3. If any of the three functions returns an error, IsAPrimaryColour() should also return the error. There is no need to wait for the other functions to finish, or to collect any other errors.
The thing I'm struggling with is how to exit the function if any of the three functions returns true, but also to wait for all three to finish if they all return false. If I use a sync.WaitGroup object, I will need to wait for all 3 goroutines to finish before I can return from the calling function.
Therefore, I'm using a loop counter to keep track of how many times I have received a message on a channel and exiting once I have received all 3 messages.
https://play.golang.org/p/kNfqWVq4Wix
package main
import (
"errors"
"fmt"
"time"
)
func main() {
x := "something"
result, err := IsAPrimaryColour(x)
if err != nil {
fmt.Printf("Error: %v\n", err)
} else {
fmt.Printf("Result: %v\n", result)
}
}
func IsAPrimaryColour(value interface{}) (bool, error) {
found := make(chan bool, 3)
errors := make(chan error, 3)
defer close(found)
defer close(errors)
var nsec int64 = time.Now().UnixNano()
//call the first function, return the result on the 'found' channel and any errors on the 'errors' channel
go func() {
result, err := IsRed(value)
if err != nil {
errors <- err
} else {
found <- result
}
fmt.Printf("IsRed done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
}()
//call the second function, return the result on the 'found' channel and any errors on the 'errors' channel
go func() {
result, err := IsGreen(value)
if err != nil {
errors <- err
} else {
found <- result
}
fmt.Printf("IsGreen done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
}()
//call the third function, return the result on the 'found' channel and any errors on the 'errors' channel
go func() {
result, err := IsBlue(value)
if err != nil {
errors <- err
} else {
found <- result
}
fmt.Printf("IsBlue done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
}()
//loop counter which will be incremented every time we read a value from the 'found' channel
var counter int
for {
select {
case result := <-found:
counter++
fmt.Printf("received a value on the results channel after %f nanoseconds. Value of counter is %d\n", float64(time.Now().UnixNano()-nsec), counter)
if result {
fmt.Printf("some goroutine returned true\n")
return true, nil
}
case err := <-errors:
if err != nil {
fmt.Printf("some goroutine returned an error\n")
return false, err
}
default:
}
//check if we have received all 3 messages on the 'found' channel. If so, all 3 functions must have returned false and we can thus return false also
if counter == 3 {
fmt.Printf("all goroutines have finished and none of them returned true\n")
return false, nil
}
}
}
func IsRed(value interface{}) (bool, error) {
return false, nil
}
func IsGreen(value interface{}) (bool, error) {
time.Sleep(time.Millisecond * 100) //change this to a value greater than 200 to make this function take longer than IsBlue()
return true, nil
}
func IsBlue(value interface{}) (bool, error) {
time.Sleep(time.Millisecond * 200)
return false, errors.New("something went wrong")
}
Although this works well enough, I wonder if I'm not overlooking some language feature to do this in a better way?
errgroup.WithContext can help simplify the concurrency here.
You want to stop all of the goroutines if an error occurs, or if a result is found. If you can express “a result is found” as a distinguished error (along the lines of io.EOF), then you can use errgroup's built-in “cancel on first error” behavior to shut down the whole group:
func IsAPrimaryColour(ctx context.Context, value interface{}) (bool, error) {
var nsec int64 = time.Now().UnixNano()
errFound := errors.New("result found")
g, ctx := errgroup.WithContext(ctx)
g.Go(func() error {
result, err := IsRed(ctx, value)
if result {
err = errFound
}
fmt.Printf("IsRed done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
return err
})
…
err := g.Wait()
if err == errFound {
fmt.Printf("some goroutine returned errFound\n")
return true, nil
}
if err != nil {
fmt.Printf("some goroutine returned an error\n")
return false, err
}
fmt.Printf("all goroutines have finished and none of them returned true\n")
return false, nil
}
(https://play.golang.org/p/MVeeBpDv4Mn)
Some remarks:
You don't need to close the channels; you know beforehand the expected count of signals to read, and that is sufficient as an exit condition.
You don't need to duplicate the manual function calls; use a slice.
Since you use a slice, you don't even need a counter or a hard-coded value of 3; just look at the length of your func slice.
That default case in the select is useless; just block on the inputs you are waiting for.
So once you get rid of all the fat, the code looks like this:
func IsAPrimaryColour(value interface{}) (bool, error) {
fns := []func(interface{}) (bool, error){IsRed, IsGreen, IsBlue}
found := make(chan bool, len(fns))
errors := make(chan error, len(fns))
for i := 0; i < len(fns); i++ {
fn := fns[i]
go func() {
result, err := fn(value)
if err != nil {
errors <- err
return
}
found <- result
}()
}
for i := 0; i < len(fns); i++ {
select {
case result := <-found:
if result {
return true, nil
}
case err := <-errors:
if err != nil {
return false, err
}
}
}
return false, nil
}
You don't need to observe the time at each and every async call; just observe the time the overall caller took to return.
func main() {
now := time.Now()
x := "something"
result, err := IsAPrimaryColour(x)
if err != nil {
fmt.Printf("Error: %v\n", err)
} else {
fmt.Printf("Result: %v\n", result)
}
fmt.Println("it took", time.Since(now))
}
https://play.golang.org/p/bARHS6c6m1c
The idiomatic way to handle multiple concurrent function calls, and to cancel any that are still outstanding once some condition is met, is to use a context value. Something like this:
func operation1(ctx context.Context) bool { ... }
func operation2(ctx context.Context) bool { ... }
func operation3(ctx context.Context) bool { ... }
func atLeastOneSuccess() bool {
ctx, cancel := context.WithCancel(context.Background())
defer cancel() // Ensure any functions still running get the signal to stop
results := make(chan bool, 3) // A channel to send results
go func() {
results <- operation1(ctx)
}()
go func() {
results <- operation2(ctx)
}()
go func() {
results <- operation3(ctx)
}()
for i := 0; i < 3; i++ {
result := <-results
if result {
// One of the operations returned success, so we'll return that
// and let the deferred call to cancel() tell any outstanding
// functions to abort.
return true
}
}
// We've looped through all return values, and they were all false
return false
}
Of course this assumes that each of the operationN functions actually honors a canceled context. This answer discusses how to do that.
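A sketch of what honoring the context could look like inside one of those operations (the 2-second sleep is a stand-in for real work):

func operation1(ctx context.Context) bool {
    select {
    case <-time.After(2 * time.Second): // pretend to do some work
        return true
    case <-ctx.Done(): // the caller already has its answer (or gave up)
        return false
    }
}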
You don't have to block the main goroutine on the Wait; you could block something else, for example:
doneCh := make(chan struct{})
go func() {
wg.Wait()
close(doneCh)
}()
Then you can wait on doneCh in your select to see if all the routines have finished.
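For example, the caller could then select on doneCh alongside an error channel (errCh here stands for whatever channel the workers report errors on):

select {
case e := <-errCh:
    fmt.Println("Error==>", e)
    return
case <-doneCh:
    // every goroutine has called wg.Done(); nothing more is coming
}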
I've been reading a few Go blogs, and more recently I stumbled upon Peter Bourgon's talk titled "Ways to do things". He shows a few examples of the actor pattern for concurrency in Go. Here is a handler example using such a pattern:
func (a *API) handleNext(w http.ResponseWriter, r *http.Request) {
var (
notFound = make(chan struct{})
otherError = make(chan error)
nextID = make(chan string)
)
a.action <- func() {
s, err := a.log.Oldest()
if err == ErrNoSegmentsAvailable {
close(notFound)
return
}
if err != nil {
otherError <- err
return
}
id := uuid.New()
a.pending[id] = pendingSegment{s, time.Now().Add(a.timeout), false}
nextID <- id
}
select {
case <-notFound:
http.NotFound(w, r)
case err := <-otherError:
http.Error(w, err.Error(), http.StatusInternalServerError)
case id := <-nextID:
fmt.Fprint(w, id)
}
}
And there's a loop behind the scenes listening for the action channel:
func (a *API) loop() {
for {
select {
case f := <-a.action:
f()
}
}
}
My question is: what is the benefit of all of this? The handler isn't any faster, because it still blocks until the action func returns something to it, which is essentially the same as just calling the function directly, outside of a goroutine. What am I missing here?
The benefits are not to a single call but to the sum of all calls.
For example you can use this to limit actual execution to a single goroutine and thereby avoid all the problems concurrent execution would bring with it.
For example I use this pattern to synchronise all usage of a connection to a hardware device that talks serial.
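A minimal sketch of that idea, using io.Writer as a stand-in for the serial connection (the Device type here is illustrative, not from the talk). All callers funnel their work through the actions channel, so only the actor goroutine ever touches the connection:

// Device owns the connection; all access goes through its actor goroutine.
type Device struct {
    port    io.Writer   // stand-in for a real serial connection
    actions chan func()
}

func NewDevice(port io.Writer) *Device {
    d := &Device{port: port, actions: make(chan func())}
    go func() {
        // Only this goroutine touches d.port, so no locks are needed.
        for f := range d.actions {
            f()
        }
    }()
    return d
}

// Send queues a write and blocks until the actor goroutine has performed it.
func (d *Device) Send(payload []byte) error {
    errCh := make(chan error, 1)
    d.actions <- func() {
        _, err := d.port.Write(payload)
        errCh <- err
    }
    return <-errCh
}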
I am trying to catch errors from a group of goroutines using a channel, but the goroutine reading the channel enters an infinite loop and starts consuming CPU.
func UnzipFile(f *bytes.Buffer, location string) error {
zipReader, err := zip.NewReader(bytes.NewReader(f.Bytes()), int64(f.Len()))
if err != nil {
return err
}
if err := os.MkdirAll(location, os.ModePerm); err != nil {
return err
}
errorChannel := make(chan error)
errorList := []error{}
go errorChannelWatch(errorChannel, errorList)
fileWaitGroup := &sync.WaitGroup{}
for _, file := range zipReader.File {
fileWaitGroup.Add(1)
go writeZipFileToLocal(file, location, errorChannel, fileWaitGroup)
}
fileWaitGroup.Wait()
close(errorChannel)
log.Println(errorList)
return nil
}
func errorChannelWatch(ch chan error, list []error) {
for {
select {
case err := <- ch:
list = append(list, err)
}
}
}
func writeZipFileToLocal(file *zip.File, location string, ch chan error, wg *sync.WaitGroup) {
defer wg.Done()
zipFilehandle, err := file.Open()
if err != nil {
ch <- err
return
}
defer zipFilehandle.Close()
if file.FileInfo().IsDir() {
if err := os.MkdirAll(filepath.Join(location, file.Name), os.ModePerm); err != nil {
ch <- err
}
return
}
localFileHandle, err := os.OpenFile(filepath.Join(location, file.Name), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, file.Mode())
if err != nil {
ch <- err
return
}
defer localFileHandle.Close()
if _, err := io.Copy(localFileHandle, zipFilehandle); err != nil {
ch <- err
return
}
ch <- fmt.Errorf("Test error")
}
So I am looping over a slice of files and writing them to my disk; when there is an error I report it back on the errorChannel to save that error into a slice.
I use a sync.WaitGroup to wait for all goroutines and when they are done I want to print errorList and check if there was any error during the execution.
The list is always empty, even if I add ch <- fmt.Errorf("test") at the end of writeZipFileToLocal and the channel always hangs up.
I am not sure what I am missing here.
1. For the first point, the infinite loop:
Citing from the Go language spec:
A receive operation on a closed channel can always proceed
immediately, yielding the element type's zero value after any
previously sent values have been received.
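A tiny illustration of that rule (the comma-ok form of receive also reports whether the value came from an actual send):

ch := make(chan error)
close(ch)
err, ok := <-ch      // proceeds immediately
fmt.Println(err, ok) // prints: <nil> false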
So in this function
func errorChannelWatch(ch chan error, list []error) {
for {
select {
case err := <- ch:
list = append(list, err)
}
}
}
After ch gets closed, this turns into an infinite loop adding nil values to list.
Try this instead:
func errorChannelWatch(ch chan error, list []error) {
for err := range ch {
list = append(list, err)
}
}
2. For the second point, why you don't see anything in your error list:
The problem is this call:
errorChannel := make(chan error)
errorList := []error{}
go errorChannelWatch(errorChannel, errorList)
Here you hand errorChannelWatch the errorList as a value, so the slice errorList itself will not be changed by the function. What is changed is the underlying array, as long as the append calls don't need to allocate a new one.
To remedy the situation, either hand a slice pointer to errorChannelWatch or rewrite it as a closure that captures errorList.
For the first proposed solution, change errorChannelWatch to
func errorChannelWatch(ch chan error, list *[]error) {
for err := range ch {
*list = append(*list, err)
}
}
and the call to
errorChannel := make(chan error)
errorList := []error{}
go errorChannelWatch(errorChannel, &errorList)
For the second proposed solution, just change the call to
errorChannel := make(chan error)
errorList := []error{}
go func() {
for err := range errorChannel {
errorList = append(errorList, err)
}
} ()
3. A minor remark:
One could think that there is a synchronisation problem here:
fileWaitGroup.Wait()
close(errorChannel)
log.Println(errorList)
How can you be sure that errorList isn't modified after the call to close? One could reason that you can't know how many values the goroutine errorChannelWatch still has to process.
Your synchronisation seems correct to me, as you do the wg.Done() after the send to the error channel, and so all error values will have been sent when fileWaitGroup.Wait() returns.
But that can change if someone later adds buffering to the error channel or otherwise alters the code.
So I would advise at least explaining the synchronisation in a comment.
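For example, something along these lines (the wording is only a suggestion):

fileWaitGroup.Wait()
// Every worker calls wg.Done() only after sending its error (if any) on the
// unbuffered errorChannel, so by the time Wait returns all errors have been
// handed to errorChannelWatch and it is safe to close the channel and read
// errorList.
close(errorChannel)
log.Println(errorList)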
I'm trying to get the logs from multiple Docker containers at once (order doesn't matter). This works as expected if types.ContainerLogsOptions.Follow is set to false.
If types.ContainerLogsOptions.Follow is set to true, sometimes the log output gets stuck after a few logs and no follow-up logs are printed to stdout.
If the output doesn't get stuck, it works as expected.
Additionally, if I restart one or all of the containers, the command doesn't exit like docker logs -f containerName does.
func (w *Whatever) Logs(options LogOptions) {
readers := []io.Reader{}
for _, container := range options.Containers {
responseBody, err := w.Docker.Client.ContainerLogs(context.Background(), container, types.ContainerLogsOptions{
ShowStdout: true,
ShowStderr: true,
Follow: options.Follow,
})
defer responseBody.Close()
if err != nil {
log.Fatal(err)
}
readers = append(readers, responseBody)
}
// concatenate all readers to one
multiReader := io.MultiReader(readers...)
_, err := stdcopy.StdCopy(os.Stdout, os.Stderr, multiReader)
if err != nil && err != io.EOF {
log.Fatal(err)
}
}
Basically there is no great difference between my implementation and that of docker logs (https://github.com/docker/docker/blob/master/cli/command/container/logs.go), hence I'm wondering what causes this issue.
As JimB commented, that method won't work due to the way io.MultiReader operates. What you need to do is read from each response individually and combine the output. Since you're dealing with logs, it would make sense to break up the reads on newlines. bufio.Scanner does this for a single io.Reader. So one option would be to create a new type that scans multiple readers concurrently.
You could use it like this:
scanner := NewConcurrentScanner(readers...)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
if err := scanner.Err(); err != nil {
log.Fatalln(err)
}
Example implementation of a concurrent scanner:
// ConcurrentScanner works like io.Scanner, but with multiple io.Readers
type ConcurrentScanner struct {
scans chan []byte // Scanned data from readers
errors chan error // Errors from readers
done chan struct{} // Signal that all readers have completed
cancel func() // Cancel all readers (stop on first error)
data []byte // Last scanned value
err error
}
// NewConcurrentScanner starts scanning each reader in a separate goroutine
// and returns a *ConcurrentScanner.
func NewConcurrentScanner(readers ...io.Reader) *ConcurrentScanner {
ctx, cancel := context.WithCancel(context.Background())
s := &ConcurrentScanner{
scans: make(chan []byte),
errors: make(chan error),
done: make(chan struct{}),
cancel: cancel,
}
var wg sync.WaitGroup
wg.Add(len(readers))
for _, reader := range readers {
// Start a scanner for each reader in its own goroutine.
go func(reader io.Reader) {
defer wg.Done()
scanner := bufio.NewScanner(reader)
for scanner.Scan() {
select {
case s.scans <- scanner.Bytes():
// While there is data, send it to s.scans,
// this will block until Scan() is called.
case <-ctx.Done():
// This fires when context is cancelled,
// indicating that we should exit now.
return
}
}
if err := scanner.Err(); err != nil {
select {
case s.errors <- err:
// Report that we got an error
case <-ctx.Done():
// Exit now if the context was cancelled; otherwise (without this case)
// the error send could block forever and this goroutine would never exit.
return
}
}
}(reader)
}
go func() {
// Signal that all scanners have completed
wg.Wait()
close(s.done)
}()
return s
}
func (s *ConcurrentScanner) Scan() bool {
select {
case s.data = <-s.scans:
// Got data from a scanner
return true
case <-s.done:
// All scanners are done, nothing to do.
case s.err = <-s.errors:
// One of the scanners errored; we're done.
}
s.cancel() // Cancel context regardless of how we exited.
return false
}
func (s *ConcurrentScanner) Bytes() []byte {
return s.data
}
func (s *ConcurrentScanner) Text() string {
return string(s.data)
}
func (s *ConcurrentScanner) Err() error {
return s.err
}
Here's an example of it working in the Go Playground: https://play.golang.org/p/EUB0K2V7iT
You can see that the concurrent scanner's output is interleaved, rather than reading all of one reader and then moving on to the next, as happens with io.MultiReader.