golang making concurrent request and combine respone - go

I have written go code to call multiple http request independently and combine the results.
Sometime values are missing on combined method.
func profile(req *http.Request) (UserMe, error, UserRating, error) {
wgcall := &sync.WaitGroup{}
uChan := make(chan ResUser)
rChan := make(chan ResRating)
// variable inits
var meResp UserMe
var ratingResp UserRating
go func() {
res := <-uChan
meResp = res.Value
}()
go func() {
res := <-rChan
ratingResp = res.Value
}()
wgcall.Add(2)
go me(req, wgcall, uChan)
go rate(req, wgcall, rChan)
wgcall.Wait()
logrus.Info(meResp) // sometimes missing
logrus.Info(ratingResp) // sometimes missing
return meResp, meErr, ratingResp, ratingErr
}
But the me and rating calls returns the values from api requests as expected.
func me(req *http.Request, wg *sync.WaitGroup, ch chan ResUser) {
defer wg.Done()
// http call return value correclty
me := ...
ch <- ResUser{
Value := // value from rest
}
logrus.Info(fmt.Sprintf("User calls %v" , me)) // always return the values
close(ch)
}
func rate(req *http.Request, wg *sync.WaitGroup, ch chan ResRating) {
defer wg.Done()
// make http call
rating := ...
ch <- ResRating{
Value := // value from rest
}
logrus.Info(fmt.Sprintf("Ratings calls %v" , rating)) // always return the values
close(ch)
}
Issue is: meResp and ratingResp on profile function not getting the values always. sometime only meResp or ratingResp, sometimes both as expected.
But me and rate function calls getting values always.
Can help me fix this plz ?

There's a race condition in your code.
There is no barrier to ensure the goroutines in the profile method which read from uChan and rChan have populated the variables meResp and ratingResp before you return from profile.
You can simplify your code immensely by dropping the use of channels and the inline declared goroutines in profile. Instead, simply populate the response values directly. There is no benefit provided by using channels or goroutines to read from them in this circumstance as you only intend to send one value and you have a requirement that the values produced by both HTTP calls are present before returning.
You might do this by modifying the signature of me and rate to receive a pointer to the location to write their output, or by wrapping their calls with a small function which receives their output value and populates the value in profile. Importantly, the WaitGroup should only be signalled after the value has been populated:
wgcall := &sync.WaitGroup{}
var meResp UserMe
var ratingResp RatingMe
wgcall.Add(2)
// The "me" and "rate" functions should be refactored to
// drop the wait group and channel arguments.
go func() {
meResp = me(req)
wgcall.Done()
}()
go func() {
ratingResp = rate(req)
wgcall.Done()
}()
wgcall.Wait()
// You are guaranteed that if "me" and "rate" returned valid values,
// they are populated in "meResp" and "ratingResp" at this point.
// Do whatever you need here, such as logging or returning.

Related

How to ensure goroutines launched within goroutines are synchronized with each other?

This is the first time I'm using the concurrency features of Go and I'm jumping right into the deep end.
I want to make concurrent calls to an API. The request is based off of the tags of the posts I want to receive back (there can be 1 <= N tags). The response body looks like this:
{
"posts": [
{
"id": 1,
"author": "Name",
"authorId": 1,
"likes": num_likes,
"popularity": popularity_decimal,
"reads": num_reads,
"tags": [ "tag1", "tag2" ]
},
...
]
}
My plan is to daisy-chain a bunch of channels together and spawn a number of goroutines that read and or write from those channels:
- for each tag, add it to a tagsChannel inside a goroutine
- use that tagsChannel inside another goroutine to make concurrent GET requests to the endpoint
- for each response of that request, pass the underlying slice of posts into another goroutine
- for each individual post inside the slice of posts, add the post to a postChannel
- inside another goroutine, iterate over postChannel and insert each post into a data structure
Here's what I have so far:
func (srv *server) Get() {
// Using red-black tree prevents any duplicates, fast insertion
// and retrieval times, and is sorted already on ID.
rbt := tree.NewWithIntComparator()
// concurrent approach
tagChan := make(chan string) // tags -> tagChan
postChan := make(chan models.Post) // tagChan -> GET -> post -> postChan
errChan := make(chan error) // for synchronizing errors across goroutines
wg := &sync.WaitGroup{} // for synchronizing goroutines
wg.Add(4)
// create a go func to synchronize our wait groups
// once all goroutines are finished, we can close our errChan
go func() {
wg.Wait()
close(errChan)
}()
go insertTags(tags, tagChan, wg)
go fetch(postChan, tagChan, errChan, wg)
go addPostToTree(rbt, postChan, wg)
for err := range errChan {
if err != nil {
srv.HandleError(err, http.StatusInternalServerError).ServeHTTP(w, r)
}
}
}
// insertTags inserts user's passed-in tags to tagChan
// so that tagChan may pass those along in fetch.
func insertTags(tags []string, tagChan chan<- string, group *sync.WaitGroup) {
defer group.Done()
for _, tag := range tags {
tagChan <- tag
}
close(tagChan)
}
// fetch completes a GET request to the endpoint
func fetch(posts chan<- models.Post, tags <-chan string, errs chan<- error, group *sync.WaitGroup) {
defer group.Done()
for tag := range tags {
ep, err := formURL(tag)
if err != nil {
errs <- err
}
group.Add(1) // QUESTION should I use a separate wait group here?
go func() {
resp, err := http.Get(ep.String())
if err != nil {
errs <- err
}
container := models.PostContainer{}
err = json.NewDecoder(resp.Body).Decode(&container)
defer resp.Body.Close()
group.Add(1) // QUESTION should I add a separate wait group here and pass it to insertPosts?
go insertPosts(posts, container.Posts, group)
defer group.Done()
}()
// group.Done() -- removed this call due to Burak, but now my program hands
}
}
// insertPosts inserts each individual post into our posts channel so that they may be
// concurrently added to our RBT.
func insertPosts(posts chan<- models.Post, container []models.Post, group *sync.WaitGroup) {
defer group.Done()
for _, post := range container {
posts <- post
}
}
// addPostToTree iterates over the channel and
// inserts each individual post into our RBT,
// setting the post ID as the node's key.
func addPostToTree(tree *tree.RBT, collection <-chan models.Post, group *sync.WaitGroup) {
defer group.Done()
for post := range collection {
// ignore return value & error here:
// we don't care about the returned key and
// error is only ever if a duplicate is attempted to be added -- we don't care
tree.Insert(post.ID, post)
}
}
I'm able to make one request to the endpoint, but as soon as try to submit a second request, my program fails with panic: sync: negative WaitGroup counter.
My question is why exactly is my WaitGroup counter going negative? I make sure to add to the waitgroup and mark when my goroutines are done.
If the waitgroup is negative on the second request, then that must mean that the first time I allocate a waitgroup and add 4 to it is being skipped... why? Does this have something to do with closing channels, maybe? And if so, where do I close a channel?
Also -- does anyone have tips for debugging goroutines?
Thanks for your help.
Firstly, the whole design is quite complicated. Mentioned my thoughts towards the end.
There are 2 problems in your code:
posts channel is never closed, due to which addPostToTree might never be existing the loop, resulting in one waitGroup never decreasing (In your case the program hangs). There is a chance the program waits indefinitely with a deadlock (Thinking other goroutine will release it, but all goroutines are stuck).
Solution: You can close the postChan channel. But how? Its always recommended for the producer to always close the channel, but you've multiple producers. So the best option is, wait for all producers to finish and then close the channel. In order to wait for all producers to finish, you'll need to create another waitGroup and use that to track the child routines.
Code:
// fetch completes a GET request to the endpoint
func fetch(posts chan<- models.Post, tags <-chan string, errs chan<- error, group *sync.WaitGroup) {
postsWG := &sync.WaitGroup{}
for tag := range tags {
ep, err := formURL(tag)
if err != nil {
errs <- err
}
postsWG.Add(1) // QUESTION should I use a separate wait group here?
go func() {
resp, err := http.Get(ep.String())
if err != nil {
errs <- err
}
container := models.PostContainer{}
err = json.NewDecoder(resp.Body).Decode(&container)
defer resp.Body.Close()
go insertPosts(posts, container.Posts, postsWG)
}()
}
defer func() {
postsWG.Wait()
close(posts)
group.Done()
}()
}
Now, we've another issue, the main waitGroup should be initialized with 3 instead of 4. This is because the main routine is only spinning up 3 more routines wg.Add(3), so it has to keep track of only those. For child routines, we're using a different waitGroup, so that is not the headache of the parent anymore.
Code:
errChan := make(chan error) // for synchronizing errors across goroutines
wg := &sync.WaitGroup{} // for synchronizing goroutines
wg.Add(3)
// create a go func to synchronize our wait groups
// once all goroutines are finished, we can close our errChan
TLDR --
Complex Design - As the main wait group is started at one place, but each goroutine is modifying this waitGroup as it feels necessary. So there is no single owner for this, which makes debugging and maintaining super complex (+ can't ensure it'll be bug free).
I would recommend breaking this down and have separate trackers for each child routines. That way, the caller who is spinning up more routines can only concentrate on tracking its child goroutines. This routine will then inform its parent waitGroup only after its done (& its child is done, rather than letting the child routinue informing the grandparent directly).
Also, in fetch method after making the HTTP call and getting the response, why create another goroutine to process this data? Either way this goroutine cannot exit until the data insertion happens, nor is it doing someother action which data processing happens. From what I understand, The second goroutine is redundant.
group.Add(1) // QUESTION should I add a separate wait group here and pass it to insertPosts?
go insertPosts(posts, container.Posts, group)
defer group.Done()

How to parallelize a recursive function

I am trying to parallelize a recursive problem in Go, and I am unsure what the best way to do this is.
I have a recursive function, which works like this:
func recFunc(input string) (result []string) {
for subInput := range getSubInputs(input) {
subOutput := recFunc(subInput)
result = result.append(result, subOutput...)
}
result = result.append(result, getOutput(input)...)
}
func main() {
output := recFunc("some_input")
...
}
So the function calls itself N times (where N is 0 at some level), generates its own output and returns everything in a list.
Now I want to make this function run in parallel. But I am unsure what the cleanest way to do this is. My Idea:
Have a "result" channel, to which all function calls send their result.
Collect the results in the main function.
Have a wait group, which determines when all results are collected.
The Problem: I need to wait for the wait group and collect all results in parallel. I can start a separate go function for this, but how do I ever quit this separate go function?
func recFunc(input string) (result []string, outputChannel chan []string, waitGroup &sync.WaitGroup) {
defer waitGroup.Done()
waitGroup.Add(len(getSubInputs(input))
for subInput := range getSubInputs(input) {
go recFunc(subInput)
}
outputChannel <-getOutput(input)
}
func main() {
outputChannel := make(chan []string)
waitGroup := sync.WaitGroup{}
waitGroup.Add(1)
go recFunc("some_input", outputChannel, &waitGroup)
result := []string{}
go func() {
nextResult := <- outputChannel
result = append(result, nextResult ...)
}
waitGroup.Wait()
}
Maybe there is a better way to do this? Or how can I ensure the anonymous go function, that collects the results, is quited when done?
tl;dr;
recursive algorithms should have bounded limits on expensive resources (network connections, goroutines, stack space etc.)
cancelation should be supported - to ensure expensive operations can be cleaned up quickly if a result is no longer needed
branch traversal should support error reporting; this allows errors to bubble up the stack & partial results to be returned without the entire recursion traversal to fail.
For asychronous results - whether using recursions or not - use of channels is recommended. Also, for long running jobs with many goroutines, provide a method for cancelation (context.Context) to aid with clean-up.
Since recursion can lead to exponential consumption of resources it's important to put limits in place (see bounded parallelism).
Below is a design patten I use a lot for asynchronous tasks:
always support taking a context.Context for cancelation
number of workers needed for the task
return a chan of results & a chan error (will only return one error or nil)
var (
workers = 10
ctx = context.TODO() // use request context here - otherwise context.Background()
input = "abc"
)
resultC, errC := recJob(ctx, workers, input) // returns results & `error` channels
// asynchronous results - so read that channel first in the event of partial results ...
for r := range resultC {
fmt.Println(r)
}
// ... then check for any errors
if err := <-errC; err != nil {
log.Fatal(err)
}
Recursion:
Since recursion quickly scales horizontally, one needs a consistent way to fill the finite list of workers with work but also ensure when workers are freed up, that they quickly pick up work from other (over-worked) workers.
Rather than create a manager layer, employ a cooperative peer system of workers:
each worker shares a single inputs channel
before recursing on inputs (subIinputs) check if any other workers are idle
if so, delegate to that worker
if not, current worker continues recursing that branch
With this algorithm, the finite count of workers quickly become saturated with work. Any workers which finish early with their branch - will quickly be delegated a sub-branch from another worker. Eventually all workers will run out of sub-branches, at which point all workers will be idled (blocked) and the recursion task can finish up.
Some careful coordination is needed to achieve this. Allowing the workers to write to the input channel helps with this peer coordination via delegation. A "recursion depth" WaitGroup is used to track when all branches have been exhausted across all workers.
(To include context support and error chaining - I updated your getSubInputs function to take a ctx and return an optional error):
func recFunc(ctx context.Context, input string, in chan string, out chan<- string, rwg *sync.WaitGroup) error {
defer rwg.Done() // decrement recursion count when a depth of recursion has completed
subInputs, err := getSubInputs(ctx, input)
if err != nil {
return err
}
for subInput := range subInputs {
rwg.Add(1) // about to recurse (or delegate recursion)
select {
case in <- subInput:
// delegated - to another goroutine
case <-ctx.Done():
// context canceled...
// but first we need to undo the earlier `rwg.Add(1)`
// as this work item was never delegated or handled by this worker
rwg.Done()
return ctx.Err()
default:
// noone available to delegate - so this worker will need to recurse this item themselves
err = recFunc(ctx, subInput, in, out, rwg)
if err != nil {
return err
}
}
select {
case <-ctx.Done():
// always check context when doing anything potentially blocking (in this case writing to `out`)
// context canceled
return ctx.Err()
case out <- subInput:
}
}
return nil
}
Connecting the Pieces:
recJob creates:
input & output channels - shared by all workers
"recursion" WaitGroup detects when all workers are idle
"output" channel can then safely be closed
error channel for all workers
kicks-off recursion workload by writing initial input to input channel
func recJob(ctx context.Context, workers int, input string) (resultsC <-chan string, errC <-chan error) {
// RW channels
out := make(chan string)
eC := make(chan error, 1)
// R-only channels returned to caller
resultsC, errC = out, eC
// create workers + waitgroup logic
go func() {
var err error // error that will be returned to call via error channel
defer func() {
close(out)
eC <- err
close(eC)
}()
var wg sync.WaitGroup
wg.Add(1)
in := make(chan string) // input channel: shared by all workers (to read from and also to write to when they need to delegate)
workerErrC := createWorkers(ctx, workers, in, out, &wg)
// get the ball rolling, pass input job to one of the workers
// Note: must be done *after* workers are created - otherwise deadlock
in <- input
errCount := 0
// wait for all worker error codes to return
for err2 := range workerErrC {
if err2 != nil {
log.Println("worker error:", err2)
errCount++
}
}
// all workers have completed
if errCount > 0 {
err = fmt.Errorf("PARTIAL RESULT: %d of %d workers encountered errors", errCount, workers)
return
}
log.Printf("All %d workers have FINISHED\n", workers)
}()
return
}
Finally, create the workers:
func createWorkers(ctx context.Context, workers int, in chan string, out chan<- string, rwg *sync.WaitGroup) (errC <-chan error) {
eC := make(chan error) // RW-version
errC = eC // RO-version (returned to caller)
// track the completeness of the workers - so we know when to wrap up
var wg sync.WaitGroup
wg.Add(workers)
for i := 0; i < workers; i++ {
i := i
go func() {
defer wg.Done()
var err error
// ensure the current worker's return code gets returned
// via the common workers' error-channel
defer func() {
if err != nil {
log.Printf("worker #%3d ERRORED: %s\n", i+1, err)
} else {
log.Printf("worker #%3d FINISHED.\n", i+1)
}
eC <- err
}()
log.Printf("worker #%3d STARTED successfully\n", i+1)
// worker scans for input
for input := range in {
err = recFunc(ctx, input, in, out, rwg)
if err != nil {
log.Printf("worker #%3d recurseManagers ERROR: %s\n", i+1, err)
return
}
}
}()
}
go func() {
rwg.Wait() // wait for all recursion to finish
close(in) // safe to close input channel as all workers are blocked (i.e. no new inputs)
wg.Wait() // now wait for all workers to return
close(eC) // finally, signal to caller we're truly done by closing workers' error-channel
}()
return
}
I can start a separate go function for this, but how do I ever quit this separate go function?
You can range over the output channel in the separate go-routine. The go-routine, in that case, will exit safely, when the channel is closed
go func() {
for nextResult := range outputChannel {
result = append(result, nextResult ...)
}
}
So, now the thing that we need to take care of is that the channel is closed after all the go-routines spawned as part of the recursive function call have successfully existed
For that, you can use a shared waitgroup across all the go-routines and wait on that waitgroup in your main function, as you are already doing. Once the wait is over, close the outputChannel, so that the other go-routine also exits safely
func recFunc(input string, outputChannel chan, wg &sync.WaitGroup) {
defer wg.Done()
for subInput := range getSubInputs(input) {
wg.Add(1)
go recFunc(subInput)
}
outputChannel <-getOutput(input)
}
func main() {
outputChannel := make(chan []string)
waitGroup := sync.WaitGroup{}
waitGroup.Add(1)
go recFunc("some_input", outputChannel, &waitGroup)
result := []string{}
go func() {
for nextResult := range outputChannel {
result = append(result, nextResult ...)
}
}
waitGroup.Wait()
close(outputChannel)
}
PS: If you want to have bounded parallelism to limit the exponential growth, check this out

Recommended way of closing redundant sql.Rows object after Go routine

I'm using Go routines to send queries to PostgreSQL master and slave nodes in parallel. The first host that returns a valid result wins. Error cases are outside the scope of this question.
The caller is the only one that cares about the contents of a *sql.Rows object, so intentionally my function doesn't do any operations on those. I use buffered channels to retrieve return objects from the Go routines, so there should be no Go routine leak. Garbage collection should take care of the rest.
There is a problem I haven't taught about properly: the Rows objects that remain behind in the channel are never closed. When I call this function from a (read only) transaction, tx.Rollback() returns an error for every instance of non-closed Rows object: "unexpected command tag SELECT".
This function is called from higher level objects:
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, error) {
rc := make(chan *sql.Rows, len(xs))
ec := make(chan error, len(xs))
for _, x := range xs {
go func(x executor) {
rows, err := x.QueryContext(ctx, query, args...)
switch { // Make sure only one of them is returned
case err != nil:
ec <- err
case rows != nil:
rc <- rows
}
}(x)
}
var me MultiError
for i := 0; i < len(xs); i++ {
select {
case err := <-ec:
me.append(err)
case rows := <-rc: // Return on the first success
return rows, nil
}
}
return nil, me.check()
}
Executors can be *sql.DB, *sql.Tx or anything that complies with the interface:
type executor interface {
ExecContext(ctx context.Context, query string, args ...interface{}) (sql.Result, error)
QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error)
QueryRowContext(ctx context.Context, query string, args ...interface{}) *sql.Row
}
Rollback logic:
func (mtx MultiTx) Rollback() error {
ec := make(chan error, len(mtx))
for _, tx := range mtx {
go func(tx *Tx) {
err := tx.Rollback()
ec <- err
}(tx)
}
var me MultiError
for i := 0; i < len(mtx); i++ {
if err := <-ec; err != nil {
me.append(err)
}
}
return me.check()
}
MultiTx is a collection of open transactions on multiple nodes. It is a higher level object that calls multiQuery
What would be the best approach to "clean up" unused rows? Options I'm thinking about not doing:
Cancel the context: I believe it will work inconsistently, multiple queries might already have returned by the time cancel() is called
Create a deferred Go routine which continues to drain the channels and close the rows objects: If a DB node is slow to respond, Rollback() is still called before rows.Close()
Use a sync.WaitGroup somewhere in the MultiTx type, maybe in combination with (2): This can cause Rollback to hang if one of the nodes is unresponsive. Also, I wouldn't be sure how I would implement that.
Ignore the Rollback errors: Ignoring errors never sounds like a good idea, they are there for a reason.
What would be the recommended way of approaching this?
Edit:
As suggested by #Peter, I've tried canceling the context, but it seems this also invalidates all the returned Rows from the query. On rows.Scan I'm getting context canceled error at the higher level caller.
This is what I've done so far:
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, error) {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
rc := make(chan *sql.Rows, len(xs))
ec := make(chan error, len(xs))
for _, x := range xs {
go func(x executor) {
rows, err := x.QueryContext(ctx, query, args...)
switch { // Make sure only one of them is returned
case err != nil:
ec <- err
case rows != nil:
rc <- rows
cancel() // Cancel on success
}
}(x)
}
var (
me MultiError
rows *sql.Rows
)
for i := 0; i < len(xs); i++ {
select {
case err := <-ec:
me.append(err)
case r := <-rc:
if rows == nil { // Only use the first rows
rows = r
} else {
r.Close() // Cleanup remaining rows, if there are any
}
}
}
if rows != nil {
return rows, nil
}
return nil, me.check()
}
Edit 2:
#Adrian mentioned:
we can't see the code that's actually using any of this.
This code is reused by type methods. First there is the transaction type. The issues in this question are appearing on the Rollback() method above.
// MultiTx holds a slice of open transactions to multiple nodes.
// All methods on this type run their sql.Tx variant in one Go routine per Node.
type MultiTx []*Tx
// QueryContext runs sql.Tx.QueryContext on the tranactions in separate Go routines.
// The first non-error result is returned immediately
// and errors from the other Nodes will be ignored.
//
// If all nodes respond with the same error, that exact error is returned as-is.
// If there is a variety of errors, they will be embedded in a MultiError return.
//
// Implements boil.ContextExecutor.
func (mtx MultiTx) QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
return multiQuery(ctx, mtx2Exec(mtx), query, args...)
}
Then there is:
// MultiNode holds a slice of Nodes.
// All methods on this type run their sql.DB variant in one Go routine per Node.
type MultiNode []*Node
// QueryContext runs sql.DB.QueryContext on the Nodes in separate Go routines.
// The first non-error result is returned immediately
// and errors from the other Nodes will be ignored.
//
// If all nodes respond with the same error, that exact error is returned as-is.
// If there is a variety of errors, they will be embedded in a MultiError return.
//
// Implements boil.ContextExecutor.
func (mn MultiNode) QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
return multiQuery(ctx, nodes2Exec(mn), query, args...)
}
These methods the public wrappers around the multiQuery() function. Now I realize that just sending the *Rows into a buffered channel to die, is actually a memory leak. In the transaction cases it becomes clear, as Rollback() starts to complain. But in the non-transaction variant, the *Rows inside the channel will never be garbage collected, as the driver might hold reference to it until rows.Close() is called.
I've written this package to by used by an ORM, sqlboiler. My higher level logic passes a MultiTX object to the ORM. From that point, I don't have any explicit control over the returned Rows. A simplistic approach would be that my higher level code cancels the context before Rollback(), but I don't like that:
It gives a non-intuitive API. This (idiomatic) approach would break:
ctx, cancel = context.WithCancel(context.Background())
defer cancel()
tx, _ := db.BeginTx(ctx)
defer tx.Rollback()
The ORM's interfaces also specify the regular, non-context aware Query() variants, which in my package's case will run against context.Background().
I'm starting to worry that this broken by design... Anyway, I will start by implementing a Go routine that will drain the channel and close the *Rows. After that I will see if I can implement some reasonable waiting / cancellation mechanism that won't affect the returned *Rows
I think that the function below will do what you require with the one provisio being that the context passed in should be cancelled when you are done with the results (otherwise one context.WithCancel will leak; I cannot see a way around that as cancelling it within the function will invalidate the returned sql.Rows).
Note that I have not had time to test this (would need to setup a database, implement your interfaces etc) so there may well be a bug hidden in the code (but I believe the basic algorithm is sound)
// queryResult holds the goroutine# and the result from that gorouting (need both so we can avoid cancelling the relevant context)
type queryResult struct {
no int
rows *sql.Rows
}
// multiQuery - Executes multiple queries and returns either the first to resutn a result or, if all fail, a multierror summarising the errors
// Important: This should be used for READ ONLY queries only (it is possible that more than one will complete)
// Note: The ctx passed in must be cancelled to avoid leaking a context (this routine cannot cancel the context used for the winning query)
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, error) {
noOfQueries := len(xs)
rc := make(chan queryResult) // Channel for results; unbuffered because we only want one, and only one, result
ec := make(chan error) // errors get sent here - goroutines must send a result or 1 error
defer close(ec) // Ensure the error consolidation go routine will complete
// We need a way to cancel individual goroutines as we do not know which one will succeed
cancelFns := make([]context.CancelFunc, noOfQueries)
// All goroutines must terminate before we exit (otherwise the transaction maybe rolled back before they are cancelled leading to "unexpected command tag SELECT")
var wg sync.WaitGroup
wg.Add(noOfQueries)
for i, x := range xs {
var queryCtx context.Context
queryCtx, cancelFns[i] = context.WithCancel(ctx)
go func(ctx context.Context, queryNo int, x executor) {
defer wg.Done()
rows, err := x.QueryContext(ctx, query, args...)
if err != nil {
ec <- err // Error collection go routine guaranteed to run until all query goroutines complete
return
}
select {
case rc <- queryResult{queryNo, rows}:
return
case <-ctx.Done(): // If another query has already transmitted its results these should be thrown away
rows.Close() // not strictly required because closed context should tidy up
return
}
}(queryCtx, i, x)
}
// Start go routine that will send a MultiError to a channel if all queries fail
mec := make(chan MultiError)
go func() {
var me MultiError
errCount := 0
for err := range ec {
me.append(err)
errCount += 1
if errCount == noOfQueries {
mec <- me
return
}
}
}()
// Wait for one query to succeed or all queries to fail
select {
case me := <-mec:
for _, cancelFn := range cancelFns { // not strictly required so long as ctx is eventually cancelled
cancelFn()
}
wg.Wait()
return nil, me.check()
case result := <-rc:
for i, cancelFn := range cancelFns { // not strictly required so long as ctx is eventually cancelled
if i != result.no { // do not cancel the query that returned a result
cancelFn()
}
}
wg.Wait()
return result.rows, nil
}
}
Thanks to the comments from #Peter and the answer of #Brits, I got fresh ideas on how to approach this.
Blue print
3 out of 4 proposals from the question were needed to be implemented.
1. Cancel the Context
mtx.QueryContext() creates a descendant context and sets the CancelFunc in the MultiTx object.
The cancelWait() helper cancels an old context and waits for MultiTX.Done if its not nil. It is called on Rollback() and before every new query.
2. Drain the channel
In multiQuery(), Upon obtaining the first successful Rows, a Go routine is launched to drain and close the remaining Rows. The rows channel no longer needs to be buffered.
An additional Go routine and a WaitGroup is used to close the error and rows channels.
3. Return a done channel
Instead of the proposed WaitGroup, multiQuery() returns a done channel. The channel is closed once the drain & close routine has finished. mtx.QueryContext() sets done the channel on the MultiTx object.
Errors
Instead of the select block, only drain the error channel if there are now Rows. The error needs to remain buffered for this reason.
Code
// MultiTx holds a slice of open transactions to multiple nodes.
// All methods on this type run their sql.Tx variant in one Go routine per Node.
type MultiTx struct {
tx []*Tx
done chan struct{}
cancels context.CancelFunc
}
func (m *MultiTx) cancelWait() {
if m.cancel != nil {
m.cancel()
}
if m.done != nil {
<-m.done
}
// reset
m.done, m.cancel = nil, nil
}
// Context creates a child context and appends CancelFunc in MultiTx
func (m *MultiTx) context(ctx context.Context) context.Context {
m.cancelWait()
ctx, m.cancel = context.WithCancel(ctx)
return ctx
}
// QueryContext runs sql.Tx.QueryContext on the tranactions in separate Go routines.
func (m *MultiTx) QueryContext(ctx context.Context, query string, args ...interface{}) (rows *sql.Rows, err error) {
rows, m.done, err = multiQuery(m.context(ctx), mtx2Exec(m.tx), query, args...)
return rows, err
}
func (m *MultiTx) Rollback() error {
m.cancelWait()
ec := make(chan error, len(m.tx))
for _, tx := range m.tx {
go func(tx *Tx) {
err := tx.Rollback()
ec <- err
}(tx)
}
var me MultiError
for i := 0; i < len(m.tx); i++ {
if err := <-ec; err != nil {
me.append(err)
}
}
return me.check()
}
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, chan struct{}, error) {
rc := make(chan *sql.Rows)
ec := make(chan error, len(xs))
var wg sync.WaitGroup
wg.Add(len(xs))
for _, x := range xs {
go func(x executor) {
rows, err := x.QueryContext(ctx, query, args...)
switch { // Make sure only one of them is returned
case err != nil:
ec <- err
case rows != nil:
rc <- rows
}
wg.Done()
}(x)
}
// Close channels when all query routines completed
go func() {
wg.Wait()
close(ec)
close(rc)
}()
rows, ok := <-rc
if ok { // ok will be false if channel closed before any rows
done := make(chan struct{}) // Done signals the caller that all remaining rows are properly closed
go func() {
for rows := range rc { // Drain channel and close unused Rows
rows.Close()
}
close(done)
}()
return rows, done, nil
}
// no rows, build error return
var me MultiError
for err := range ec {
me.append(err)
}
return nil, nil, me.check()
}
Edit: Cancel & wait for old contexts before every Query, as *sql.Tx is not Go routine save, all previous queries have to be done before a next call.

Golang Concurrency Issue

I am learning Golang concurrency and have written a program to display URL's in order. I expect the code to return
http://bing.com*
http://google.com*
But it always returns http:/google.com*** . As if the variable is being overwritten.Since i am using goroutines i would expect it to return both values at the sametime.
func check(u string) string {
tmpres := u+"*****"
return tmpres
}
func IsReachable(url string) string {
ch := make(chan string, 1)
go func() {
ch <- check(url)
}()
select {
case reachable := <-ch:
// use err and reply
return reachable
case <-time.After(3* time.Second):
// call timed out
return "none"
}
}
func main() {
var urls = []string{
"http://bing.com/",
"http://google.com/",
}
for _, url := range urls {
go func() {
fmt.Println(IsReachable(url))
}()
}
time.Sleep(1 * time.Second)
}
Two problems. First, you've created a race condition. By closing over the loop variable, you're sharing it between the thread running the loop and the thread running the goroutine, which is causing your described problem: by the time the goroutine that was started for the first URL tries to run, the value of the variable has changed. You need to either copy it to a local variable, or pass it as an argument, e.g.:
for _, url := range urls {
go func(url string) {
fmt.Println(IsReachable(url))
}(url)
}
Second, you said you wanted to display them "in order", which is not a goal generally compatible with concurrency/parallism, because you cannot control the order of parallel operations. If you want them in order, you should do them in order in a single thread. Otherwise, you'll have to collect the results, wait for all them to come back, then sort the results back into the desired order before printing them.

Goroutine implementation doubts

I need some help on understanding how to use goroutines in this problem. I will post only some snippets of code but if you want to take a deep look you can check it out here
Basically, I have a distributor function which receives a request slice being called many times, and each time the function is called it must distribute this request among other functions to actually resolve the request. And what I'm trying to create a channel and launch this function to resolve the request on a new goroutine, so the program can handle requests concurrently.
How the distribute function is called:
// Run trigger the system to start receiving requests
func Run() {
// Since the programs starts here, let's make a channel to receive requests
requestCh := make(chan []string)
idCh := make(chan string)
// If you want to play with us you need to register your Sender here
go publisher.Sender(requestCh)
go makeID(idCh)
// Our request pool
for request := range requestCh {
// add ID
request = append(request, <-idCh)
// distribute
distributor(request)
}
// PROBLEM
for result := range resultCh {
fmt.Println(result)
}
}
The distribute function itself:
// Distribute requests to respective channels.
// No waiting in line. Everybody gets its own goroutine!
func distributor(request []string) {
switch request[0] {
case "sum":
arithCh := make(chan []string)
go arithmetic.Exec(arithCh, resultCh)
arithCh <- request
case "sub":
arithCh := make(chan []string)
go arithmetic.Exec(arithCh, resultCh)
arithCh <- request
case "mult":
arithCh := make(chan []string)
go arithmetic.Exec(arithCh, resultCh)
arithCh <- request
case "div":
arithCh := make(chan []string)
go arithmetic.Exec(arithCh, resultCh)
arithCh <- request
case "fibonacci":
fibCh := make(chan []string)
go fibonacci.Exec(fibCh, resultCh)
fibCh <- request
case "reverse":
revCh := make(chan []string)
go reverse.Exec(revCh, resultCh)
revCh <- request
case "encode":
encCh := make(chan []string)
go encode.Exec(encCh, resultCh)
encCh <- request
}
}
And the fibonacci.Exec function to illustrate how I'm trying to calculate the Fibonacci given a request received on the fibCh and sending the result value through the resultCh.
func Exec(fibCh chan []string, result chan map[string]string) {
fib := parse(<-fibCh)
nthFibonacci(fib)
result <- fib
}
So far, at the Run function when I range over the resultCh I get the results but also a deadlock. But why? Also, I imagine that I should use the waitGroup function to wait the goroutines to finish but I'm not sure of how implement that since I'm expecting receive a continuous stream of requests. I would appreciate some help on understanding what I'm doing wrong here and a way to solve it.
I'm not digging into the implementation details of your application, but basically as it sounds to me, you can use the workers pattern.
Using the workers pattern multiple goroutines can read from a single channel, distributing an amount of work between CPU cores, hence the workers name. In Go, this pattern is easy to implement - just start a number of goroutines with channel as parameter, and just send values to that channel - distributing and multiplexing will be done by Go runtime, automagically.
Here is a simple implementation of the workers pattern:
package main
import (
"fmt"
"sync"
"time"
)
func worker(tasksCh <-chan int, wg *sync.WaitGroup) {
defer wg.Done()
for {
task, ok := <-tasksCh
if !ok {
return
}
d := time.Duration(task) * time.Millisecond
time.Sleep(d)
fmt.Println("processing task", task)
}
}
func pool(wg *sync.WaitGroup, workers, tasks int) {
tasksCh := make(chan int)
for i := 0; i < workers; i++ {
go worker(tasksCh, wg)
}
for i := 0; i < tasks; i++ {
tasksCh <- i
}
close(tasksCh)
}
func main() {
var wg sync.WaitGroup
wg.Add(36)
go pool(&wg, 36, 50)
wg.Wait()
}
Another useful resource how you can use the WaitGroup to wait for all the goroutines to finish the execution before to continue (hence to not trap into deadlock) is this nice article:
http://nathanleclaire.com/blog/2014/02/15/how-to-wait-for-all-goroutines-to-finish-executing-before-continuing/
And a very basic implementation of it:
Go playground
If you do not want to change the implementation to use the worker pattern maybe would be a good idea to use another channel to signify the end of goroutine execution, because deadlock happens when there is no receiver to accept the sent message through unbuffered channel.
done := make(chan bool)
//.....
done <- true //Tell the main function everything is done.
So when you receive the message you mark the execution as completed by setting the channel value to true.

Resources