Benefits of actor pattern in HTTP handler - go

I've been reading a few Go blogs and more recently I stumbled upon Peter Bourgon's talk titled "Ways to do things". He shows a few examples of the actor pattern for concurrency in Go. Here is a handler example using that pattern:
func (a *API) handleNext(w http.ResponseWriter, r *http.Request) {
	var (
		notFound   = make(chan struct{})
		otherError = make(chan error)
		nextID     = make(chan string)
	)
	a.action <- func() {
		s, err := a.log.Oldest()
		if err == ErrNoSegmentsAvailable {
			close(notFound)
			return
		}
		if err != nil {
			otherError <- err
			return
		}
		id := uuid.New()
		a.pending[id] = pendingSegment{s, time.Now().Add(a.timeout), false}
		nextID <- id
	}
	select {
	case <-notFound:
		http.NotFound(w, r)
	case err := <-otherError:
		http.Error(w, err.Error(), http.StatusInternalServerError)
	case id := <-nextID:
		fmt.Fprint(w, id)
	}
}
And there's a loop behind the scenes listening for the action channel:
func (a *API) loop() {
	for {
		select {
		case f := <-a.action:
			f()
		}
	}
}
My question is: what is the benefit of all this? The handler isn't any faster, because it still blocks until the action closure sends something back to it, which is essentially the same as just calling the function directly, outside the goroutine. What am I missing here?

The benefits are not to a single call but to the sum of all calls.
For example, you can use this to limit actual execution to a single goroutine and thereby avoid all the problems that concurrent execution would bring with it.
For example I use this pattern to synchronise all usage of a connection to a hardware device that talks serial.
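To make that serialization benefit concrete, here is a minimal, self-contained sketch (the counter type and names are illustrative, not from the talk): all mutation of the actor's state happens on one goroutine, so no mutex is needed even if many goroutines call Incr concurrently.

```go
package main

import "fmt"

// counterActor owns its state; all mutation happens on one goroutine,
// so no mutex is needed. Requests arrive as closures on the action channel.
type counterActor struct {
	action chan func()
	n      int
}

func newCounterActor() *counterActor {
	a := &counterActor{action: make(chan func())}
	go a.loop()
	return a
}

func (a *counterActor) loop() {
	for f := range a.action {
		f() // executed one at a time, in arrival order
	}
}

// Incr asks the actor to bump the counter and waits for the new value.
func (a *counterActor) Incr() int {
	res := make(chan int)
	a.action <- func() {
		a.n++
		res <- a.n
	}
	return <-res
}

func main() {
	a := newCounterActor()
	for i := 0; i < 3; i++ {
		fmt.Println(a.Incr())
	}
}
```

The caller still blocks per call, exactly as the question observes; the win is that increments from any number of goroutines can never interleave inside the critical section.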

Related

Golang Concurrency Issue to introduce timeout

I wish to implement parallel API calling in Go using goroutines. The requirements:
1. Once the requests are fired, I need to wait for all responses (which take different amounts of time).
2. If any of the requests fails and returns an error, I wish to end (or preempt) the other goroutines.
3. I also want a timeout value associated with each goroutine (or API call).
I have implemented 1 and 2 below, but need help with 3. Feedback on 1 and 2 is also welcome.
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	c := make(chan interface{}, 1)
	c2 := make(chan interface{}, 1)
	err := make(chan interface{})

	wg.Add(1)
	go func() {
		defer wg.Done()
		result, e := doSomeWork()
		if e != nil {
			err <- e
			return
		}
		c <- result
	}()

	wg.Add(1)
	go func() {
		defer wg.Done()
		result2, e := doSomeWork2()
		if e != nil {
			err <- e
			return
		}
		c2 <- result2
	}()

	go func() {
		wg.Wait()
		close(c)
		close(c2)
		close(err)
	}()

	for e := range err {
		// an error happened here; you could exit the caller function
		fmt.Println("Error==>", e)
		return
	}
	fmt.Println(<-c, <-c2)
}
// mimic api call 1
func doSomeWork() (function1, error) {
	time.Sleep(10 * time.Second)
	obj := function1{"ABC", "29"}
	return obj, nil
}

type function1 struct {
	Name string
	Age  string
}

// mimic api call 2
func doSomeWork2() (function2, error) {
	time.Sleep(4 * time.Second)
	r := errors.New("Error Occured")
	if 1 == 2 {
		fmt.Println(r)
	}
	obj := function2{"Delhi", "Delhi"}
	// return error as nil for now
	return obj, nil
}

type function2 struct {
	City  string
	State string
}
Thanks in advance.
This kind of fork-and-join pattern is exactly what golang.org/x/sync/errgroup was designed for. (Identifying the appropriate “first error” from a group of goroutines can be surprisingly subtle.)
You can use errgroup.WithContext to obtain a context.Context that is cancelled as soon as any goroutine in the group returns a non-nil error. The (*Group).Wait method waits for the goroutines to complete and returns that first error.
For your example, that might look something like: https://play.golang.org/p/jqYeb4chHCZ.
You can then inject a timeout within any given call by wrapping the Context using context.WithTimeout.
(However, in my experience if you've plumbed in cancellation correctly, explicit timeouts are almost never helpful — the end user can cancel explicitly if they get tired of waiting, and you probably don't want to promote degraded service to a complete outage if something starts to take just a bit longer than you expected.)
To support timeouts and cancelation of goroutine work, the standard mechanism is to use context.Context.
ctx := context.Background() // root context

// wrap the context with a timeout and/or cancelation mechanism
ctx, cancel := context.WithTimeout(ctx, 5*time.Second) // with timeout
//ctx, cancel := context.WithCancel(ctx)               // no timeout, just cancel
defer cancel() // release the context's resources if we never cancel/time out
Next, your worker goroutines need to accept the ctx and monitor its state. To do this in parallel with the time.Sleep (which mimics a long computation), convert the sleep to a channel-based wait:
// mimic api call 1
func doSomeWork(ctx context.Context) (function1, error) {
	//time.Sleep(10 * time.Second)
	select {
	case <-time.After(10 * time.Second):
		// wait completed
	case <-ctx.Done():
		return function1{}, ctx.Err()
	}
	// ...
}
And if one worker goroutine fails, to signal to the other worker that the request should be aborted, simply call the cancel() function.
result, e := doSomeWork(ctx)
if e != nil {
	cancel() // <- add this
	err <- e
	return
}
Pulling this all together:
https://play.golang.org/p/1Kpe_tre7XI
EDIT: the sleep example above is obviously a contrived way to abort a "fake" task. In the real world, HTTP or SQL DB calls would be involved, and the standard library has added context support to these potentially blocking calls (database/sql in Go 1.8, http.Request.WithContext in Go 1.7, http.NewRequestWithContext in Go 1.13):
func doSomeWork(ctx context.Context) error {
	// DB
	db, err := sql.Open("mysql", "...") // check err
	//rows, err := db.Query("SELECT age FROM users WHERE age = ?", age)
	rows, err := db.QueryContext(ctx, "SELECT age FROM users WHERE age = ?", age)
	if err != nil {
		return err // will return with an error if the context is canceled
	}
	defer rows.Close()

	// http
	//req, err := http.NewRequest("GET", "http://example.com", nil)
	req, err := http.NewRequestWithContext(ctx, "GET", "http://example.com", nil) // check err
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err // will return with an error if the context is canceled
	}
	defer resp.Body.Close()
	return nil
}
EDIT (2): to poll a context's state without blocking, leverage select's default branch:
select {
case <-ctx.Done():
	return ctx.Err()
default:
	// if ctx is not done, this branch is taken
}
The default branch can optionally contain code, but even when empty its presence prevents blocking, so the select just polls the status of the context at that instant.

Can we restrict function calling once at a time from goroutine

I have the following situation:
wg.Add(1)
go func(wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		select {
		case <-tickerCR.C:
			_ = ProcessCommands()
		case <-ow.quitCR:
			logger.Debug("Stopping ProcessCommands goroutine")
			return
		}
	}
}(&wg)
Can I somehow make sure that if ProcessCommands is executing, the next ticker event is ignored? Basically I want to avoid parallel execution of ProcessCommands.
What you want is called mutual exclusion. It can be achieved by Mutex.
var m sync.Mutex

func process() {
	m.Lock()
	defer m.Unlock()
	ProcessCommands()
}
You could create a type that has two fields, a function and a mutex, and when its, let's say, Call method is invoked, it locks, defers the unlock, and calls the stored function. Afterwards you just need to create instances of that type with the required functions. OOP to the rescue. Remember that functions can be stored in a struct the same way a string would be.
import (
	"sync"
)

type ProtectedCaller struct {
	m sync.Mutex
	f func()
}

func (caller *ProtectedCaller) Call() {
	caller.m.Lock()
	defer caller.m.Unlock()
	caller.f()
}

func ProtectCall(f func()) ProtectedCaller {
	return ProtectedCaller{f: f}
}
var processCommands = ProtectCall(ProcessCommands)
There's a semi-standard module x/sync/singleflight:
How to use:
import "golang.org/x/sync/singleflight"

var requestGroup singleflight.Group

// This handler should call its upstream only once:
http.HandleFunc("/singleflight", func(w http.ResponseWriter, r *http.Request) {
	// define the request group - each request can have its own specific ID;
	// singleflight ensures only 1 request with any given ID is processed at a time.
	// You can also use different IDs to process requests simultaneously -
	// just set the ID to "singleflight-1", "singleflight-2", etc.
	res, err, shared := requestGroup.Do("singleflight", func() (interface{}, error) {
		fmt.Println("calling the endpoint")
		response, err := http.Get("https://jsonplaceholder.typicode.com/photos")
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return nil, err
		}
		responseData, err := ioutil.ReadAll(response.Body)
		if err != nil {
			log.Fatal(err)
		}
		time.Sleep(2 * time.Second)
		return string(responseData), err
	})
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	result := res.(string)
	fmt.Println("shared = ", shared)
	fmt.Fprintf(w, "%q", result)
})
You can use sync.Once to prevent a function from being called multiple times, like this:
wg.Add(1)
var once sync.Once
go func(wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		select {
		case <-tickerCR.C:
			// look at this line: the "ProcessCommands" function will be called only once
			once.Do(ProcessCommands)
		case <-ow.quitCR:
			logger.Debug("Stopping ProcessCommands goroutine")
			return
		}
	}
}(&wg)

Await two results concurrently with timeout

Use case
I'd like to run two queries against a database in parallel and return after a maximum time of 600ms whatever I have fetched to that point. I am struggling with implementing the concurrency for this requirement.
Code
func (s *Service) GetCustomerStats(ctx context.Context, customerUUID string) *CustomerStats {
	stats := &CustomerStats{
		CustomerUUID: customerUUID,
		Type:         "ERROR",
		OrderCount:   "ERROR",
	}
	var wg sync.WaitGroup
	var mu sync.Mutex

	// Get order count
	wg.Add(1)
	go func() {
		defer wg.Done()
		orderCount, err := s.Storage.GetOrderCount(ctx, customerUUID)
		if err != nil {
			return
		}
		mu.Lock()
		stats.OrderCount = strconv.Itoa(orderCount)
		if orderCount == 0 {
			stats.OrderCount = "NA"
		}
		mu.Unlock()
	}()

	// Get customer type
	wg.Add(1)
	go func() {
		defer wg.Done()
		custType, err := s.Storage.GetCustomerType(ctx, customerUUID) // "type" is a reserved word
		if err != nil {
			return
		}
		mu.Lock()
		stats.Type = strconv.Itoa(custType)
		mu.Unlock()
	}()

	wg.Wait()
	return stats
}
The problem
The context I pass into that function has a timeout of 600ms defined. I pass it on to the storage repo and the Database driver uses it as well, but it does not guarantee it will respond within that time as it does schedule some retries under the hood.
However I must ensure that this function returns within the passed context timeout (600ms). I am currently using a waitgroup to await the results but I wouldn't know how to return stats once the context is done.
Basically I am looking for something like this. My research indicates that I should probably use channels which signal that the work is done but I am not sure how I would implement that so that it's simple code.
select {
case wg.Wait():
	return stats
case <-ctx.Done():
	return stats
}
The way you plan to select on ctx.Done() looks correct.
It's the way you work with your mutable state that is wrong, in my opinion.
Try something like this:
var stats = CustomerStats{}
for {
	select {
	case t := <-typeChan:
		stats.Type = t
		if stats.OrderCount != "" {
			return stats
		}
	case count := <-countChan:
		stats.OrderCount = count
		if stats.Type != "" {
			return stats
		}
	case <-ctx.Done():
		return stats
	}
}
Now your functions should look like this:
go func() {
	orderCount, err := s.Storage.GetOrderCount(ctx, customerUUID)
	if err != nil {
		return // here you probably want an errChan as well
	}
	if orderCount == 0 {
		countChan <- "NA"
	} else {
		countChan <- strconv.Itoa(orderCount)
	}
}()
This is all a bit sketchy, since your example is quite complex, but should give you the direction to follow.

To avoid multiple database calls blocking each other in a Go web app handler, are goroutines + syncGroup the only way?

Having taken a look at several web application examples and boilerplates, the approach they take tends to be in the form of this (I'm using a Gin handler here as an example, and imaginary User and Billing "repository" structs that fetch data from either a database or an external API. I omitted error handling to make the example shorter) :
func GetUserDetailsHandler(c *gin.Context) {
	// this result presumably comes from the app's database
	var userResult = UserRepository.FindById(c.getInt("user_id"))

	// assume that this result comes from a different data source (e.g. a different
	// database) altogether, hence why we're not just doing a join query with "User"
	var billingInfo = BillingRepository.FindById(c.getInt("user_id"))

	c.JSON(http.StatusOK, gin.H{
		"user_data":    userResult,
		"billing_data": billingInfo,
	})
	return
}
In the above scenario, the call to "User.FindById" might use some kind of database driver, but as far as I'm aware, all available Golang database/ORM libraries return data in a "synchronous" fashion (e.g: as return values, not via channels). As such, the call to "User.FindById" will block until it's complete, before I can move on to executing "BillingInfo.FindById", which is not at all ideal since they can both work in parallel.
So I figured that the best idea was to make use of go routines + syncGroup to solve the problem. Something like this:
func GetUserDetailsHandler(c *gin.Context) {
	var waitGroup sync.WaitGroup
	// buffered, so the goroutines can send before anyone receives;
	// with unbuffered channels this code would deadlock on waitGroup.Wait()
	userChannel := make(chan User, 1)
	billingChannel := make(chan Billing, 1)

	waitGroup.Add(1)
	go func() {
		defer waitGroup.Done()
		userChannel <- UserRepository.FindById(c.getInt("user_id"))
	}()

	waitGroup.Add(1)
	go func() {
		defer waitGroup.Done()
		billingChannel <- BillingRepository.FindById(c.getInt("user_id"))
	}()

	waitGroup.Wait()
	userInfo := <-userChannel
	billingInfo := <-billingChannel

	c.JSON(http.StatusOK, gin.H{
		"user_data":    userInfo,
		"billing_data": billingInfo,
	})
	return
}
Now, this presumably does the job. But it seems unnecessarily verbose to me, and potentially error prone (if I forget to "Add" to the waitGroup before any go routine, or if I forget to "Wait", then it all falls apart). Is this the only way to do this? Or is there something simpler that I'm missing out?
Maybe something like this:
package main

import (
	"fmt"
	"sync"
)

func GetUserDetailsHander(c *gin.Context) {
	var userInfo UserInfo
	var billingInfo BillingInfo
	err := parallel(
		func() (e error) {
			userInfo, e = UserRepository.FindById(c.getInt("user_id"))
			return
		},
		func() (e error) {
			billingInfo, e = BillingRepository.FindById(c.getInt("user_id"))
			return
		},
	)
	fmt.Println(err)
	c.JSON(http.StatusOK, gin.H{
		"user_data":    userInfo,
		"billing_data": billingInfo,
	})
	return
}

func parallel(do ...func() error) error {
	var err error
	rcverr := make(chan error)
	var wg sync.WaitGroup
	for _, d := range do {
		wg.Add(1)
		go func(do func() error) {
			rcverr <- do()
			wg.Done()
		}(d)
	}
	go func() {
		wg.Wait()
		close(rcverr)
	}()
	for range do {
		e := <-rcverr
		if e != nil {
			err = e // could return here for a fast path
		}
	}
	return err
}

Anonymous function doesn't seem to execute in Go routine

I have the following code. Pay special attention to the anonymous function:
func saveMatterNodes(matterId int, nodes []doculaw.LitigationNode) (bool, error) {
	var (
		err  error
		resp *http.Response
	)
	// Do this in multiple threads
	for _, node := range nodes {
		fmt.Println("in loops")
		go func() {
			postValues := doculaw.LitigationNode{
				Name:        node.Name,
				Description: node.Description,
				Days:        node.Days,
				Date:        node.Date,
				IsFinalStep: false,
				Completed:   false,
				Matter:      matterId}
			b := new(bytes.Buffer)
			json.NewEncoder(b).Encode(postValues)
			resp, err = http.Post("http://127.0.0.1:8001/matterNode/", "application/json", b)
			io.Copy(os.Stdout, resp.Body)
			fmt.Println("Response from http post", resp)
			if err != nil {
				fmt.Println(err)
			}
		}()
	}
	if err != nil {
		return false, err
	} else {
		return true, nil
	}
}
If I remove the go func() {}() part and just leave the code in between, it seems to execute fine, but the moment I add it back it does not execute. Any idea why that is? I initially thought maybe it's because it executes on a different thread, but this doesn't seem to be the case, as I can see in my web service access logs that it is not executing.
I think this behaviour occurs because the function never yields back to the main goroutine: after you launch the goroutines, there is no construct in the program that waits for them to finish their work.
Channels, IO operations, sync.WaitGroup, etc. can yield control back to the main goroutine.
You may want to try sync.WaitGroup
Example: https://play.golang.org/p/Zwn0YBynl2
