Keep retrying a function in Golang - go

I am trying to build functionality that works as follows:
As soon as the service function is called, it uses the Fetch function to get records from a service (they arrive as byte arrays), JSON-unmarshals each byte array, populates a struct, and then passes the struct to a DB function to save it to the database.
Since this needs to be a continuous job, I have added two if conditions: if the records received have length 0, we use the retry function to retry pulling the records; otherwise we just write to the database.
I have been trying to debug the retry function for a while now, but it is just not working: it basically stops after the first retry (even though I specify the attempts as 100). What can I do to make sure it keeps retrying to pull the records?
The code is as follows:
// RETRY FUNCTION
func retry(attempts int, sleep time.Duration, f func() error) (err error) {
for i := 0; ; i++ {
err = f()
if err == nil {
return
}
if i >= (attempts - 1) {
break
}
time.Sleep(sleep)
sleep *= 2
log.Println("retrying after error:", err)
}
return fmt.Errorf("after %d attempts, last error: %s", attempts, err)
}
//Save Data function
type Records struct {
Messages [][]byte
}
func (s *Service) SaveData(records Records, lastSentPlace uint) error {
//lastSentPlace is sent as 0 to begin with.
for i := lastSentPlace; i <= records.Place-1; i++ {
var msg Records
msg.Unmarshal(records.Messages[i])
order := MyStruct{
Fruit: msg.Fruit,
Burger: msg.Burger,
Fries: msg.Fries,
}
err := s.db.UpdateOrder(context.TODO(), nil , order)
if err != nil {
logging.Error("Error occured...")
}
}
return nil
}
//Service function (This runs as a batch, which is why we need retrying)
func (s *Service) MyServiceFunction(ctx context.Context, place uint, length uint) (err error) {
var lastSentPlace = place
records, err := s.Poll(context.Background(), place, length)
if err != nil {
logging.Info(err)
}
// if no records found then retry.
if len(records.Messages) == 0 {
err = retry(100, 2*time.Minute, func() (err error) {
records, err := s.Poll(context.Background(), place, length)
// if data received, write to DB
if len(records.Messages) != 0 {
err = s.SaveData(records, lastSentPlace)
}
return
})
// if data is not received, or if err is not null, retry
if err != nil || len(records.Messages) == 0 {
log.Println(err)
return
}
// if data received on first try, then no need to retry, write to db
} else if len(records.Messages) >0 {
err = s.SaveData(records, lastSentPlace)
if err != nil {
return err
}
}
return nil
}
I think the issue is with the way I am trying to implement the retry function. I have been trying to debug this for a while but, being new to the language, I am really stuck. What I wanted to do was implement a backoff if no records are found. Any help is greatly appreciated.
Thanks!

Here is a simpler retry.
It uses a plain bounded for loop, which is easier to verify for correctness.
We sleep before each retry but not before the first attempt, so i > 0 is the condition for sleeping.
Here's the code:
func retry(attempts int, sleep time.Duration, f func() error) (err error) {
for i := 0; i < attempts; i++ {
if i > 0 {
log.Println("retrying after error:", err)
time.Sleep(sleep)
sleep *= 2
}
err = f()
if err == nil {
return nil
}
}
return fmt.Errorf("after %d attempts, last error: %s", attempts, err)
}

I know this is an old question, but I came across it when searching for retries and used it as the base of a solution.
This version accepts a func with 2 return values and uses generics, available since Go 1.18, to make that possible. I tried it in 1.17, but couldn't figure out a way to make the function generic.
This could be extended to any number of return values of any type. I have used any here, but that could be limited to a list of types.
func retry[T any](attempts int, sleep int, f func() (T, error)) (result T, err error) {
for i := 0; i < attempts; i++ {
if i > 0 {
log.Println("retrying after error:", err)
time.Sleep(time.Duration(sleep) * time.Second)
sleep *= 2
}
result, err = f()
if err == nil {
return result, nil
}
}
return result, fmt.Errorf("after %d attempts, last error: %s", attempts, err)
}
Usage example:
var config Configuration
something, err := retry(config.RetryAttempts, config.RetrySleep, func() (Something, error) { return GetSomething(config.Parameter) })
func GetSomething(parameter string) (something Something, err error) {
// Do something flakey here that might need a retry...
return something, err
}
Hope that helps someone with the same use case as me.

The function you are calling takes a context, so it is important that you handle that context.
If you don't know what a context is or how to use it, I would recommend this post: https://blog.golang.org/context
Your retry function should also handle the context. Just to get you on track, here is a simple implementation.
func retryMyServiceFunction(ctx context.Context, place uint, length uint, sleep time.Duration) {
for {
select {
case <-ctx.Done():
return
default:
err := MyServiceFunction(ctx, place, length)
if err != nil {
log.Println("handle error here!", err)
time.Sleep(sleep)
} else {
return
}
}
}
}
I don't like the sleep part, so you should analyse the returned error to decide whether a retry makes sense. You also have to think about timeouts: if you let your service sleep too long, a timeout could occur.

There is a library for the retry mechanism.
https://github.com/avast/retry-go
url := "http://example.com"
var body []byte
err := retry.Do(
func() error {
resp, err := http.Get(url)
if err != nil {
return err
}
defer resp.Body.Close()
body, err = ioutil.ReadAll(resp.Body)
if err != nil {
return err
}
return nil
},
)
fmt.Println(body)

Looking at the Go Playground example in the comments of the accepted answer, there are some things I would consider adding. Using continue and break in the for loop makes the loop even simpler, because the if i > 0 { statement is no longer needed. Furthermore, I would use early returns in all functions so they return directly on an error. Lastly, I would consistently use errors to signal whether a function failed; checking the validity of a value should happen inside the executed function itself.
This is my attempt:
package main
import (
"errors"
"fmt"
"log"
"time"
)
func main() {
var complicatedFunctionPassing bool = false
var attempts int = 5
// if complicatedFunctionPassing is true retry just makes one try
// if complicatedFunctionPassing is false retry makes ... attempts
err := retry(attempts, time.Second, func() (err error) {
if !complicatedFunctionPassing {
return errors.New("something went wrong in the important function")
}
log.Println("Complicated function passed")
return nil
})
if err != nil {
log.Printf("failed after %d attempts with error: %s", attempts, err.Error())
}
}
func retry(attempts int, sleep time.Duration, f func() error) (err error) {
for i := 0; i < attempts; i++ {
fmt.Println("This is attempt number", i+1)
// calling the important function
err = f()
if err != nil {
log.Printf("error occurred after attempt number %d: %s", i+1, err.Error())
log.Println("sleeping for: ", sleep.String())
time.Sleep(sleep)
sleep *= 2
continue
}
break
}
return err
}
You can try it out here:
https://go.dev/play/p/Ag8ObCb980U

Related

How to make this InTx func (for SQL transactions) "safe" if there is a panic during callback?

Playground link: https://go.dev/play/p/laQo-BfF7sK
It's subtle, but this InTx "context manager" (in transaction) has at least one bug. If there is a panic during the "Fun" call, the transaction is committed instead of being rolled back:
type Fun func(context.Context, *sql.Tx) error
func InTx(db *sql.DB, fn Fun) error {
ctx := context.Background()
t, err := db.BeginTx(ctx, nil)
if err != nil {
log.Panicln(err)
return err
}
return safe(ctx, t, fn)
}
// safe should run the provided function in the context of a SQL transaction
// expect a nil error if (and only if) everything worked w/o incident
func safe(ctx context.Context, t *sql.Tx, fn Fun) (err error) {
defer func() {
if err == nil {
err = t.Commit()
return
}
if bad := t.Rollback(); bad != nil && bad != sql.ErrTxDone {
err = fmt.Errorf("during rollback, panic(%v); err=%w", bad, err)
// log error
return
}
}()
err = fn(ctx, t)
return
}
Here is an example to demonstrate:
func main() {
var db *sql.DB;
// ...
_ = InTx(db, func(ctx context.Context, t *sql.Tx) error {
// ... lots more SQL executed here ...
if _, err := t.Exec("DELETE FROM products"); err != nil {
return err
}
// ...
panic("will cause Commit")
// should expect Rollback() instead, as if we:
//return nil
})
}
Related: Would it be inappropriate to panic during another panic, e.g. if Rollback fails? If so, why? (or when not)
Adding recover in another defer (after the first one in the safe function, since they unwind in stack order) would guard against an "inner" panic from the callback, but that may be sub-optimal or less idiomatic than other approaches.
defer func() {
if veryBad := recover(); veryBad != nil {
bad := t.Rollback()
err = fmt.Errorf("aborted SQL due to panic: %v; err=%w", veryBad, bad)
// log error, should re-panic here?
return
}
}()
I'd be very happy to accept someone else's Go wisdom in lieu of my potentially-flawed approach.

Check if all goroutines have finished without using wg.Wait()

Let's say I have a function IsAPrimaryColour() which works by calling three other functions IsRed(), IsGreen() and IsBlue(). Since the three functions are quite independent of one another, they can run concurrently. The return conditions are:
If any of the three functions returns true, IsAPrimaryColour() should also return true. There is no need to wait for the other functions to finish. That is: IsAPrimaryColour() is true if IsRed() is true OR IsGreen() is true OR IsBlue() is true.
If all functions return false, IsAPrimaryColour() should also return false. That is: IsAPrimaryColour() is false if IsRed() is false AND IsGreen() is false AND IsBlue() is false.
If any of the three functions returns an error, IsAPrimaryColour() should also return the error. There is no need to wait for the other functions to finish, or to collect any other errors.
The thing I'm struggling with is how to exit the function as soon as any of the three functions returns true, but also to wait for all three to finish if they all return false. If I use a sync.WaitGroup object, I will need to wait for all 3 goroutines to finish before I can return from the calling function.
Therefore, I'm using a loop counter to keep track of how many messages I have received on a channel, and exiting the loop once I have received all 3 messages.
https://play.golang.org/p/kNfqWVq4Wix
package main
import (
"errors"
"fmt"
"time"
)
func main() {
x := "something"
result, err := IsAPrimaryColour(x)
if err != nil {
fmt.Printf("Error: %v\n", err)
} else {
fmt.Printf("Result: %v\n", result)
}
}
func IsAPrimaryColour(value interface{}) (bool, error) {
found := make(chan bool, 3)
errors := make(chan error, 3)
defer close(found)
defer close(errors)
var nsec int64 = time.Now().UnixNano()
//call the first function, return the result on the 'found' channel and any errors on the 'errors' channel
go func() {
result, err := IsRed(value)
if err != nil {
errors <- err
} else {
found <- result
}
fmt.Printf("IsRed done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
}()
//call the second function, return the result on the 'found' channel and any errors on the 'errors' channel
go func() {
result, err := IsGreen(value)
if err != nil {
errors <- err
} else {
found <- result
}
fmt.Printf("IsGreen done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
}()
//call the third function, return the result on the 'found' channel and any errors on the 'errors' channel
go func() {
result, err := IsBlue(value)
if err != nil {
errors <- err
} else {
found <- result
}
fmt.Printf("IsBlue done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
}()
//loop counter which will be incremented every time we read a value from the 'found' channel
var counter int
for {
select {
case result := <-found:
counter++
fmt.Printf("received a value on the results channel after %f nanoseconds. Value of counter is %d\n", float64(time.Now().UnixNano()-nsec), counter)
if result {
fmt.Printf("some goroutine returned true\n")
return true, nil
}
case err := <-errors:
if err != nil {
fmt.Printf("some goroutine returned an error\n")
return false, err
}
default:
}
//check if we have received all 3 messages on the 'found' channel. If so, all 3 functions must have returned false and we can thus return false also
if counter == 3 {
fmt.Printf("all goroutines have finished and none of them returned true\n")
return false, nil
}
}
}
func IsRed(value interface{}) (bool, error) {
return false, nil
}
func IsGreen(value interface{}) (bool, error) {
time.Sleep(time.Millisecond * 100) //change this to a value greater than 200 to make this function take longer than IsBlue()
return true, nil
}
func IsBlue(value interface{}) (bool, error) {
time.Sleep(time.Millisecond * 200)
return false, errors.New("something went wrong")
}
Although this works well enough, I wonder if I'm not overlooking some language feature to do this in a better way?
errgroup.WithContext can help simplify the concurrency here.
You want to stop all of the goroutines if an error occurs, or if a result is found. If you can express “a result is found” as a distinguished error (along the lines of io.EOF), then you can use errgroup's built-in “cancel on first error” behavior to shut down the whole group:
func IsAPrimaryColour(ctx context.Context, value interface{}) (bool, error) {
var nsec int64 = time.Now().UnixNano()
errFound := errors.New("result found")
g, ctx := errgroup.WithContext(ctx)
g.Go(func() error {
result, err := IsRed(ctx, value)
if result {
err = errFound
}
fmt.Printf("IsRed done in %f nanoseconds \n", float64(time.Now().UnixNano()-nsec))
return err
})
…
err := g.Wait()
if err == errFound {
fmt.Printf("some goroutine returned errFound\n")
return true, nil
}
if err != nil {
fmt.Printf("some goroutine returned an error\n")
return false, err
}
fmt.Printf("all goroutines have finished and none of them returned true\n")
return false, nil
}
(https://play.golang.org/p/MVeeBpDv4Mn)
Some remarks:
You don't need to close the channels: you know beforehand how many signals to expect, and that is sufficient as an exit condition.
You don't need to duplicate the function calls by hand; use a slice.
Since you use a slice, you don't even need a counter or a hard-coded 3; just look at the length of your func slice.
The default case in the select is useless (it turns the loop into a busy-wait). Just block on the input you are waiting for.
Once you get rid of all the fat, the code looks like this:
func IsAPrimaryColour(value interface{}) (bool, error) {
fns := []func(interface{}) (bool, error){IsRed, IsGreen, IsBlue}
found := make(chan bool, len(fns))
errors := make(chan error, len(fns))
for i := 0; i < len(fns); i++ {
fn := fns[i]
go func() {
result, err := fn(value)
if err != nil {
errors <- err
return
}
found <- result
}()
}
for i := 0; i < len(fns); i++ {
select {
case result := <-found:
if result {
return true, nil
}
case err := <-errors:
if err != nil {
return false, err
}
}
}
return false, nil
}
You don't need to measure the time of each and every async call; just measure how long the overall caller took to return.
func main() {
now := time.Now()
x := "something"
result, err := IsAPrimaryColour(x)
if err != nil {
fmt.Printf("Error: %v\n", err)
} else {
fmt.Printf("Result: %v\n", result)
}
fmt.Println("it took", time.Since(now))
}
https://play.golang.org/p/bARHS6c6m1c
The idiomatic way to handle multiple concurrent function calls, and cancel any outstanding after a condition, is with the use of a context value. Something like this:
func operation1(ctx context.Context) bool { ... }
func operation2(ctx context.Context) bool { ... }
func operation3(ctx context.Context) bool { ... }
func atLeastOneSuccess() bool {
ctx, cancel := context.WithCancel(context.Background())
defer cancel() // Ensure any functions still running get the signal to stop
results := make(chan bool, 3) // A channel to send results
go func() {
results <- operation1(ctx)
}()
go func() {
results <- operation2(ctx)
}()
go func() {
results <- operation3(ctx)
}()
for i := 0; i < 3; i++ {
result := <-results
if result {
// One of the operations returned success, so we'll return that
// and let the deferred call to cancel() tell any outstanding
// functions to abort.
return true
}
}
// We've looped through all return values, and they were all false
return false
}
Of course this assumes that each of the operationN functions actually honors a canceled context. This answer discusses how to do that.
You don't have to block the main goroutine on the Wait, you could block something else, for example:
doneCh := make(chan struct{})
go func() {
wg.Wait()
close(doneCh)
}()
Then you can wait on doneCh in your select to see if all the routines have finished.
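Put together, the idea might look like the following sketch. The helper name anyTrue and its signature are assumptions for illustration, and error handling from the question is left out to keep the focus on the doneCh pattern.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// anyTrue runs all checks concurrently. It returns true as soon as one check
// reports true, and false once every check has finished without reporting true.
func anyTrue(checks ...func() bool) bool {
	// Buffered so goroutines that finish after we have returned never block.
	found := make(chan struct{}, len(checks))
	var wg sync.WaitGroup
	for _, check := range checks {
		wg.Add(1)
		go func(check func() bool) {
			defer wg.Done()
			if check() {
				found <- struct{}{}
			}
		}(check)
	}

	// Wait in a separate goroutine so the select below is not blocked by wg.Wait.
	doneCh := make(chan struct{})
	go func() {
		wg.Wait()
		close(doneCh)
	}()

	select {
	case <-found:
		return true
	case <-doneCh:
		// Everything finished. A true result may have arrived at the same
		// moment, so look at the buffer once more before answering false.
		select {
		case <-found:
			return true
		default:
			return false
		}
	}
}

func main() {
	slowFalse := func() bool { time.Sleep(20 * time.Millisecond); return false }
	fastTrue := func() bool { return true }
	fmt.Println(anyTrue(slowFalse, fastTrue))
	fmt.Println(anyTrue(slowFalse, slowFalse))
}
```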

Best approach to getting results out of goroutines

I have two functions that I cannot change (see first() and second() below). They are returning some data and errors (the output data is different, but in the examples below I use (string, error) for simplicity)
I would like to run them in separate goroutines - my approach:
package main
import (
"fmt"
"os"
)
func first(name string) (string, error) {
if name == "" {
return "", fmt.Errorf("empty name is not allowed")
}
fmt.Println("processing first")
return fmt.Sprintf("First hello %s", name), nil
}
func second(name string) (string, error) {
if name == "" {
return "", fmt.Errorf("empty name is not allowed")
}
fmt.Println("processing second")
return fmt.Sprintf("Second hello %s", name), nil
}
func main() {
firstCh := make(chan string)
secondCh := make(chan string)
go func() {
defer close(firstCh)
res, err := first("one")
if err != nil {
fmt.Printf("Failed to run first: %v\n", err)
}
firstCh <- res
}()
go func() {
defer close(secondCh)
res, err := second("two")
if err != nil {
fmt.Printf("Failed to run second: %v\n", err)
}
secondCh <- res
}()
resultsOne := <-firstCh
resultsTwo := <-secondCh
// It's important for my app to do error checking and stop if errors exist.
if resultsOne == "" || resultsTwo == "" {
fmt.Println("There was an ERROR")
os.Exit(1)
}
fmt.Println("ONE:", resultsOne)
fmt.Println("TWO:", resultsTwo)
}
I believe one caveat is that resultsOne := <- firstCh blocks until first goroutine finishes, but I don't care too much about this.
Can you please confirm that my approach is good? What other approaches would be better in my situation?
The example looks mostly good. A couple improvements are:
declaring your channels as buffered
firstCh := make(chan string, 1)
secondCh := make(chan string, 1)
With unbuffered channels, send operations block (until someone receives). If your goroutine #2 is much faster than the first, it will have to wait until the first finishes as well, since you receive in sequence:
resultsOne := <-firstCh // waiting on this one first
resultsTwo := <-secondCh // sender blocked because the main thread hasn't reached this point
use "golang.org/x/sync/errgroup".Group. The program will feel "less native", but it spares you from managing channels by hand. The trade-off, in a non-contrived setting, is that you have to synchronize writes to the results yourself:
func main() {
var (
resultsOne string
resultsTwo string
)
g := errgroup.Group{}
g.Go(func() error {
res, err := first("one")
if err != nil {
return err
}
resultsOne = res
return nil
})
g.Go(func() error {
res, err := second("two")
if err != nil {
return err
}
resultsTwo = res
return nil
})
err := g.Wait()
// ... handle err
}

Simulate an HTTP request with success or failure using retry logic

I want to simulate a retry scenario over HTTP, like this:
the first two HTTP attempts fail (using some faulty URLs)
the third succeeds (with a valid URL)
This is a bit tricky; any idea how to do it? I tried looping over the doSomething method with a different URL each time, but that doesn't demonstrate the point,
which is, for example, to retry at least 3 times until you get HTTP 200 (success). Any idea how I could simulate it?
Maybe run in a loop over the following...
www.stackoverflow.com2
www.stackoverflow.com1
www.stackoverflow.com
https://play.golang.org/p/dblPh1T0XBu
package main
import (
`fmt`
`log`
"net/http"
`time`
`github.com/cenkalti/backoff/v4`
)
func main() {
b := backoff.NewExponentialBackOff()
b.MaxElapsedTime = 3 * time.Second
retryable := func() error {
val, err := doSomething("https://www.google.com1")
if err != nil {
return err
}
fmt.Println(val)
return nil
}
notify := func(err error, t time.Duration) {
log.Printf("error: %v happened at time: %v", err, t)
}
err := backoff.RetryNotify(retryable, b, notify)
if err != nil {
fmt.Errorf("error after retrying: %v", err)
}
}
func doSomething(url string) (int, error) {
res, e := http.Get(url)
if e != nil {
fmt.Println("error occurred: ", e)
return 500, e
}
return res.StatusCode, nil
}
The idea from the comment (below) addresses part of the problem, but I need to use actual HTTP calls.
https://play.golang.org/p/FTR7J2r-QB7
package main
import (
`fmt`
`log`
`time`
`github.com/cenkalti/backoff/v4`
)
func main() {
b := backoff.NewExponentialBackOff()
b.MaxElapsedTime = 3 * time.Second
retrybuilder := func (count int) func() error {
return func() error {
var succeed bool
count -= 1
if count == 0 {
succeed = true
}
val, err := doSomething(succeed)
if err != nil {
fmt.Println("response: ", val)
}
return err
}
}
notify := func(err error, t time.Duration) {
log.Printf("error: %v happened at time: %v", err, t)
}
err := backoff.RetryNotify(retrybuilder(3), b, notify)
if err != nil {
fmt.Printf("error after retrying: %v", err)
}
}
func doSomething(succeed bool) (int, error) {
if !succeed {
return 500, fmt.Errorf("E_SIMULATED: sim error")
}
return 200, nil
}

Almost Repeating Myself

"Combinatorial Explosion: You have lots of code that does almost the same thing, but with tiny variations in data or behavior. This can be difficult to refactor; perhaps using generics or an interpreter?" - Jeff Atwood via Coding Horror
In this case it is not lots of code, but it is still bugging me. The problem is shared by two functions: when connecting to an IP fails, I should retry with the next IP.
I have one function which generates a producer for NSQ:
//Since we are in a critical system, we try with each IP until we get a producer
var err error
for i, success := 0, false; i < len(ips) && !success; i++ {
publisher, err = nsq.NewProducer(ips[i], nsq.NewConfig())
if err == nil {
success = true
}
}
The other function that almost shares the same code is one which takes a NSQ consumer and connects it:
var err error
for i, success := 0, false; i < len(ips) && !success; i++ {
err = consumer.ConnectToNSQD(ips[i])
if err == nil {
success = true
}
}
I would like to get rid of this almost repeated code without sacrificing legibility. Ideas?
You have it backwards. Your solution should follow the shape of the problem, not the shape of a particular solution. There's nothing in the solution that's worth refactoring. It's just going to add pointless complexity.
For example,
package main
import "github.com/nsqio/go-nsq"
// NewProducer is nsq.NewProducer with retries of an address list.
func NewProducer(addrs []string, config *nsq.Config) (producer *nsq.Producer, err error) {
if len(addrs) == 0 {
addrs = append(addrs, "")
}
for _, addr := range addrs {
producer, err = nsq.NewProducer(addr, config)
if err == nil {
break
}
}
return producer, err
}
// ConnectToNSQD is nsq.ConnectToNSQD with retries of an address list.
func ConnectToNSQD(c *nsq.Consumer, addrs []string) (err error) {
if len(addrs) == 0 {
addrs = append(addrs, "")
}
for _, addr := range addrs {
err = c.ConnectToNSQD(addr)
if err == nil {
break
}
}
return err
}
func main() {}
Maybe something like this?
var publisher *nsq.Producer
connectToWorkingIP(ips, func(ip string) error {
var err error
publisher, err = nsq.NewProducer(ip, nsq.NewConfig())
return err
})
connectToWorkingIP(ips, func(ip string) error {
return consumer.ConnectToNSQD(ip)
})
func connectToWorkingIP(ips []string, f func(string) error) {
for i, success := 0, false; i < len(ips) && !success; i++ {
err := f(ips[i])
if err == nil {
success = true
}
}
}

Resources