Retry on redis connection failure - go

Wondering why redigo decided not to export the errorConn type, which would allow applications to have specific error handling for connection failures. As implemented, applications have to handle these as generic errors.
For example, our application generally doesn't care if a single PUT fails, but if the issue is a redis connection failure or redis pool being exhausted, moving on to the next PUT (especially if it requires opening a new connection) is a bad idea. We should stop and retry (with exponential backoff) until the connection comes back.
Code example where redigo returns a generic error if the connection pool is exhausted

The lines of code in your link return two values of the respective types: (Conn, error).
if !p.Wait && p.MaxActive > 0 && p.active >= p.MaxActive {
p.mu.Unlock()
return errorConn{ErrPoolExhausted}, ErrPoolExhausted
}
The type Conn is an interface with an Err method.
// Err returns a non-nil value when the connection is not usable.
Err() error
So to obtain the underlying error, you can either:
call the Err method on the first return value; or
check the second error return value.
As a side note, the recommended way to compare errors is by using errors.Is and/or errors.As from the standard library errors package.

Related

Is it possible to miss events when opening a WebSocket?

I am looking at the properties on WebSocket. Specifically, I want to handle connection errors, but I am a little perturbed by the fact that I can only set the listener property after the connection attempt has already started. Is there any risk that I might miss, for example, an error event triggered by the connection attempt (e.g. no network interface)?
Looking at the spec, it sounds like the connection attempt happens in parallel (asynchronously), and I know it will be pretty slow compared to executing a couple of extra lines of code, even though my connection is LAN-based. My concern is that I am calling the JS functions from WASM compiled from Go, and I know there is some performance overhead in calls across the language boundary. Should I be concerned?
example code from MDN:
// Create WebSocket connection.
const socket = new WebSocket('ws://localhost:8080');
// Connection opened
socket.addEventListener('open', function (event) {
socket.send('Hello Server!');
});
example source
what I am doing:
webSocketCon := js.Global().Get("WebSocket")
ws := webSocketCon.New(url)
handlerError := js.FuncOf(func(this js.Value, args []js.Value) interface{} {
//code to handle error
return nil
})
ws.Set("onerror", handlerError)

Idiomatic way to handle logic error vs programming error in golang

I have been using golang to automate some deploy processes and I had to use exec package to call some bash scripts.
I used exec.Command("/home/rodrigo/my-deploy.sh").CombinedOutput() and looked at its implementation:
func (c *Cmd) CombinedOutput() ([]byte, error) {
if c.Stdout != nil {
return nil, errors.New("exec: Stdout already set")
}
if c.Stderr != nil {
return nil, errors.New("exec: Stderr already set")
}
var b bytes.Buffer
c.Stdout = &b
c.Stderr = &b
err := c.Run()
return b.Bytes(), err
}
I realized you can't assign c.Stdout when using CombinedOutput(), and I think that's OK, but the way this is reported to the API caller does not seem right.
CombinedOutput() returns an error when you use it incorrectly: if you are going to call CombinedOutput(), you must not assign c.Stderr or c.Stdout beforehand, and if you do, you receive an error.
But this error does not mean that your script failed; it means you are using the API incorrectly. In that case I believe you should get a panic, because incorrect API usage should not be handled (I think).
I come from Java World and when you are using some method in the wrong way then you receive an RuntimeException, for example.
public void run(Job job) throws NotCompletedJob {
if (job.getId() != null) {
throw new IllegalArgumentException("This job should not have id");
}
job.setId(calculateId());
job.run();
}
With this signature I know that calling run(job) with a Job that already has an id is my mistake, so I can distinguish between an error in my script and incorrect use of the API.
NotCompletedJob is a checked exception, so I must handle it, but IllegalArgumentException is not, so it can be thrown at any time. Catching IllegalArgumentException or any other RuntimeException is not always considered good practice, because it indicates a programmer error rather than an expected failure such as NotCompletedJob.
Having said that, how can I differentiate a programming error (incorrect API usage, for example) from an expected error (the script did not finish successfully) with the current CombinedOutput() implementation?
To clarify my concern: I'm not saying the current implementation of CombinedOutput is wrong, but I don't understand how the caller can tell whether an error came from the command being executed or from incorrect use of the API.
I believe a better approach would be to panic when the caller uses the API incorrectly, just as when the caller passes a nil reference to a function that expects a non-nil one (which is in fact the current behaviour).
I come from Java World and when you are using some method in the wrong
way then you receive an RuntimeException.
You are in the Go world now. Therefore, that argument is invalid. Abandon Java.
The Go Programming Language Specification
Handling panics
Two built-in functions, panic and recover, assist in reporting and
handling run-time panics and program-defined error conditions.
func panic(interface{})
func recover() interface{}
While executing a function F, an explicit call to panic or a run-time
panic terminates the execution of F. Any functions deferred by F are
then executed as usual. Next, any deferred functions run by F's caller
are run, and so on up to any deferred by the top-level function in the
executing goroutine. At that point, the program is terminated and the
error condition is reported, including the value of the argument to
panic. This termination sequence is called panicking.
The Go Blog
Defer, Panic, and Recover
The convention in the Go libraries is that even when a package uses
panic internally, its external API still presents explicit error
return values.
Go Code Review Comments
This page collects common comments made during reviews of Go code, so
that a single detailed explanation can be referred to by shorthands.
This is a laundry list of common mistakes, not a style guide.
Don't Panic
See https://golang.org/doc/effective_go.html#errors. Don't use panic
for normal error handling. Use error and multiple return values.
Effective Go
Errors
The usual way to report an error to a caller is to return an error as
an extra return value.
Your Go server program is concurrently handling 100,000 clients. If an error occurs, report and handle it; always check for errors. DON'T crash all 100,000 clients with a panic. Go packages should not panic.
Read the Go documentation and the Go standard library code.

Should I use panic or return error?

Go provides two ways of handling errors, but I'm not sure which one to use.
Assuming I'm implementing a classic ForEach function which accepts a slice or a map as an argument. To check whether an iterable is passed in, I could do:
func ForEach(iterable interface{}, f interface{}) {
if isNotIterable(iterable) {
panic("Should pass in a slice or map!")
}
}
or
func ForEach(iterable interface{}, f interface{}) error {
if isNotIterable(iterable) {
return fmt.Errorf("Should pass in a slice or map!")
}
}
I saw some discussions saying panic() should be avoided, but people also say that if program cannot recover from error, you should panic().
Which one should I use? And what's the main principle for picking the right one?
You should assume that a panic will be immediately fatal, for the entire program, or at the very least for the current goroutine. Ask yourself "when this happens, should the application immediately crash?" If yes, use a panic; otherwise, use an error.
Use panic.
Because your use case is to catch a bad use of your API. This should never happen at runtime if the program is calling your API properly.
In fact, any program calling your API with correct arguments will behave the same way if the test is removed. The test is there only to fail early with an error message that is helpful to the programmer who made the mistake. Ideally, the panic would be reached once during development when running the test suite, and the programmer would fix the call before committing the bad code, so the incorrect use would never reach production.
See also this response to the question "Is function parameter validation using errors a good pattern in Go?".
I like the way it's done in some libraries where on top of a regular method DoSomething, its "panicky" version is added with MustDoSomething. I'm relatively new to go, but I've already seen it in several places, notably sqlx.
In general, if you want to expose your code to someone else, you should either have Must- and a regular version of the method, or your methods/functions should give the client a chance to recover the way they want and so error should be available to them in a go-idiomatic way.
Having said that, I agree that if your API/library is used inappropriately, it's Ok to panic as well. As a matter of fact, I've also seen methods like MustGetenv() that will panic if a critical env.var is missing. Fail-fast mechanism basically.
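The Must- convention described here also appears in the standard library (regexp.MustCompile, template.Must). A minimal sketch of the pairing follows; ParsePort and MustParsePort are hypothetical functions made up for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// ParsePort returns an error so callers can decide how to react to bad input.
func ParsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range 1-65535", n)
	}
	return n, nil
}

// MustParsePort panics on failure; it is meant for values fixed at startup,
// where a bad value is a programming or deployment mistake.
func MustParsePort(s string) int {
	n, err := ParsePort(s)
	if err != nil {
		panic(err)
	}
	return n
}

func main() {
	fmt.Println(MustParsePort("8080")) // prints 8080

	// User-supplied input goes through the error-returning variant.
	if _, err := ParsePort("http"); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

The Must- variant is a thin wrapper, so the validation logic lives in one place and the caller picks the failure mode.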
If a mandatory requirement is missing when the service starts (e.g. a database connection or a required piece of service configuration), then you should use panic.
Return an error for anything caused by a user request or by a server-side failure.
Ask yourself these questions:
Do you expect the exceptional situation to occur no matter how well you code your app? Do you think it would be useful to make the user aware of the condition as part of normal usage of your app? Handle it as an error, because it is part of the application working normally.
Should that exceptional situation NOT occur if you code appropriately (and somewhat defensively)? (Examples: dividing by zero, or accessing an array element out of bounds.) Is your app totally clueless when that error occurs? Panic.
Do you have your API and want to ensure users use it appropriately? Panic. Your API will seldom recover if used incorrectly.
Use error whenever possible
Only use panic when your code could end up in a bad state that would be prone to crashing; something truly unexpected. The example above with ForEach() is an exported func that accepts an interface so it should expect someone will improperly call it. And if it is improperly called, you know why you cannot continue and you know how to handle that error. isNotIterable is literally binary and easy to control.
But error is not like a try/catch
Even if you try to justify panic/recover by looking at throw/catch from other languages, you still use errors. We know you are trying the function because you are calling it, we know there was an error because err != nil, and just like checking the type of exception thrown you can check the type of error returned with errors.Is(err, ErrNotIterable)
So should you use panic for errors in concurrency?
The answer is still most likely no. Errors are still the preferred way in Go, and you can use an errgroup (golang.org/x/sync/errgroup) together with a context to shut down the goroutines:
ctx, cancel := context.WithTimeout(context.Background(), time.Minute*5)
// automatically cancel in 5 min
defer cancel()
errGroup, ctx := errgroup.WithContext(ctx)
errGroup.Go(func() error {
// do crazy stuff; you can still check for errors
if ... {
return fmt.Errorf("critical error, stopping all goroutines now")
}
// code completed without issues
return nil
})
// Wait blocks until all goroutines finish and returns the first non-nil error
err := errGroup.Wait()
Even using the structure of the original example, you still have better control with errors than panics:
func ForEach(iterable interface{}, f interface{}) error {
	if isNotIterable(iterable) {
		return fmt.Errorf("expected something iterable but got %T", iterable)
	}
	// inspect the dynamic kind of the argument
	v := reflect.ValueOf(iterable)
	switch v.Kind() {
	case reflect.Map:
		...
	case reflect.Array, reflect.Slice:
		...
	default:
		return fmt.Errorf("isNotIterable is false but I do not know how to iterate through %T", iterable)
	}
	return nil
}
But error feels very verbose
Yes, that is the point. When an error is returned, that is the moment to do something about it. You are giving the calling code options rather than deciding on its behalf to start shutting down and killing the application unless it calls recover(). If you just return the same error all the way up the call stack, then error will seem inferior to panic, but that is the result of not addressing issues where they happen.
So when to use panic?
When your code is on a collision course to crash and you cannot assume your way out of it. Another is when the code assumes something that is no longer true and having to check the integrity in every function from here on out would be tedious (and might impact performance). Still, you would use panic() only to get out of the layers of uncertainty... then still handle errors:
func ForEach(iterable interface{}, f interface{}) (err error) {
	// err is a named result so that the deferred function can set it
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("cannot iterate due to unexpected runtime error %v", r)
		}
	}()
	...
	// perhaps a broken pipe in a global var
	// or an included module threw a panic at you!
}
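Turning a panic into a returned error only works if the deferred function can assign to the function's result, which in Go means the result must be named. A self-contained sketch of that mechanism, using a hypothetical helper called safeCall:

```go
package main

import "fmt"

// safeCall runs f and converts any panic it raises into an ordinary error.
// The named result err is what lets the deferred function change the value
// that safeCall returns.
func safeCall(f func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	f()
	return nil
}

func main() {
	var items []int
	// Indexing an empty slice triggers a run-time panic inside f.
	err := safeCall(func() { _ = items[3] })
	fmt.Println(err != nil) // prints true

	fmt.Println(safeCall(func() {})) // prints <nil>
}
```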
But if you are still not convinced... Here is the Go FAQ
We believe that coupling exceptions to a control structure, as in the try-catch-finally idiom, results in convoluted code. It also tends to encourage programmers to label too many ordinary errors, such as failing to open a file, as exceptional.
Go takes a different approach. For plain error handling, Go's multi-value returns make it easy to report an error without overloading the return value. A canonical error type, coupled with Go's other features, makes error handling pleasant but quite different from that in other languages.
A panic typically means something went unexpectedly wrong. Mostly used to fail fast on errors that shouldn’t occur during normal operation, or that we aren’t prepared to handle gracefully. So in this case just return the error, you don't want your program to panic.
I think none of the previous answers are correct:
By default, if we don't know what to do with the "error", the code must panic, following the fail-fast pattern:
https://en.wikipedia.org/wiki/Fail-fast
Putting it more formally, our "Turing machine" is broken and we need to come back to a "stable state" or "reset state". More info at https://en.wikipedia.org/wiki/Reset_(computing)
For example, in web (micro)services that means returning a 40X error (panic caused by input from the user) or a 50X error (panic caused by something else: hardware, network, assertion failure, ...).
If we know what to do with the "error", then we do not have an error in the first place, but an uncomfortable return value. This is a normal execution condition and probably not an error. Normally this corresponds to happy-path versus non-happy-path modeling.
In summary, the err return value is mostly a wrong idea, even if the Go community has adopted it as a religion. Using error return values is just a patchy way to speed up program execution, since it requires fewer CPU instructions to implement, but most of the time, except for low-level services, it is useless and promotes dirty code. (Note that Go was designed to implement those low-level services as an "easy C", but it was adopted for high-level (layer 7) application programs, where an error must fail fast to avoid continuing in an undefined state that could cause money to be lost or even fatal casualties.) In case of doubt, default to panic.
Don't use panic for normal error handling. Use error and multiple return values. See https://golang.org/doc/effective_go.html#errors.

What's the best way to determine when an RPC session ends using a StreamClientInterceptor?

When writing a StreamClientInterceptor function, what's the best way to determine when an invoker finishes the RPC? This is straightforward enough with unary interceptors or on the server-side where you're passed a handler that performs the RPC, but it's not clear how best to do this on the client-side where you return a ClientStream that the invoker then interacts with.
One use case for this is instrumenting OpenTracing, where the goal is to start and finish a span to mark the beginning and end of the RPC.
A strategy I'm looking into is having the stream interceptor return a decorated ClientStream. This new ClientStream considers the RPC to have completed if any of the interface methods Header, CloseSend, SendMsg, RecvMsg return an error or if the Context is cancelled. Additionally, it adds this logic to RecvMsg:
func (cs *DecoratedClientStream) RecvMsg(m interface{}) error {
err := cs.ClientStream.RecvMsg(m)
if err == io.EOF {
// Consider the RPC as complete
return err
} else if err != nil {
// Consider the RPC as complete
return err
}
if !cs.isResponseStreaming {
// Consider the RPC as complete
}
return err
}
It would work in most cases, but my understanding is that an invoker isn't required to call Recv if it knows the result will be io.EOF (See Are you required to call Recv until you get io.EOF when interacting with grpc.ClientStreams?), so it wouldn't work in all cases. Is there a better way to accomplish this?
I had a very similar issue where I wanted to trace streaming gRPC calls. Other than decorating the stream as you mentioned yourself, I was not able to find a good way to detect the end of streams. That is, until I came across the stats hooks provided by grpc-go (https://godoc.org/google.golang.org/grpc/stats). Even though the stats API is meant for gathering statistics about the RPC calls, the hooks it provides are very helpful for tracing as well.
If you're still looking for a way to trace streaming calls, I have written a library for OpenTracing instrumentation of gRPC, using the stats hooks:
https://github.com/charithe/otgrpc. However, please bear in mind that this approach is probably not suitable for systems that create long-lived streams.

is it wrong to treat panic / recover as throw / catch

Speaking as a new Go enthusiast trying to work with the Go way of error handling. To be clear: I like exceptions.
I have a server that accepts a connection, processes a set of requests, and replies to them. I found that I can do
if err != nil{
panic(err)
}
deep down in the processing code, and have
defer func() {
if err := recover(); err != nil {
log.Printf("%s: %s", err, debug.Stack()) // line 20
}
}()
in the client connection code (each connection runs in its own goroutine). This nicely wraps everything up, forcefully closes the connection (other defers fire), and my server continues to hum along.
But this feels an awful lot like a throw/catch scenario, which Go states it doesn't support. Questions:
Is this stable? That is, is recovering from a panic an OK thing to do as an ongoing way of life, rather than something intended only to slightly defer an immediate shutdown?
I looked for a discussion on this topic and did not find one anywhere. Any pointers?
I feel that the answer is "yes, it works" and that it can be used inside your own code, but that panic should NOT be used by a library intended for wider use. The standard and polite way for a library to behave is to return errors.
Yes, you can do what you suggest. There are some situations within the standard packages where panic/recover is used for handling errors. The official Go blog states:
For a real-world example of panic and recover, see the json package
from the Go standard library. It decodes JSON-encoded data with a set
of recursive functions. When malformed JSON is encountered, the parser
calls panic to unwind the stack to the top-level function call, which
recovers from the panic and returns an appropriate error value (see
the 'error' and 'unmarshal' methods of the decodeState type in
decode.go).
Some pointers:
Use error for your normal use cases. This should be your default.
If your code would get clearer and simpler by using a panic/recover (such as with a recursive call stack), then use it for that particular case.
Never let a package leak panics. Panics used within a package should be recovered within the package and returned as an error.
Recovering from a panic is stable. Don't worry about continuing execution after a recover. You can see such behavior in the standard library, for example in the net/http package, which recovers from panics within handlers so that a panic on a single request does not crash the entire HTTP server.
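The second and third pointers can be combined in one small package-style sketch: a recursive helper panics with a private error type to unwind the stack, and the exported function recovers only that type and returns it as an error, re-panicking on anything else so that genuine bugs are not hidden. This is modelled loosely on the pattern the blog post describes, not copied from encoding/json; MaxDepth, parseSeq, and parseError are hypothetical names.

```go
package main

import "fmt"

// parseError marks panics raised on purpose by this parser.
type parseError struct{ msg string }

func (e parseError) Error() string { return e.msg }

// MaxDepth returns the deepest nesting level of square brackets in s.
// The recursive helper panics on malformed input; MaxDepth recovers and
// returns an ordinary error, so no panic leaks to the caller.
func MaxDepth(s string) (depth int, err error) {
	defer func() {
		if r := recover(); r != nil {
			pe, ok := r.(parseError)
			if !ok {
				panic(r) // not ours: a genuine bug, let it propagate
			}
			depth, err = 0, pe
		}
	}()
	pos := 0
	depth = parseSeq(s, &pos, 0)
	if pos != len(s) {
		panic(parseError{fmt.Sprintf("unexpected %q at offset %d", s[pos], pos)})
	}
	return depth, nil
}

// parseSeq consumes zero or more "[...]" groups starting at *pos and
// returns the deepest level seen.
func parseSeq(s string, pos *int, level int) int {
	deepest := level
	for *pos < len(s) && s[*pos] == '[' {
		*pos++
		if d := parseSeq(s, pos, level+1); d > deepest {
			deepest = d
		}
		if *pos >= len(s) || s[*pos] != ']' {
			panic(parseError{fmt.Sprintf("missing ']' at offset %d", *pos)})
		}
		*pos++
	}
	return deepest
}

func main() {
	fmt.Println(MaxDepth("[[]][]")) // prints "2 <nil>"
	fmt.Println(MaxDepth("[[]"))    // prints "0 missing ']' at offset 3"
}
```

Without the panic, every level of parseSeq would need to return and check an error; with it, the error path is written once at the package boundary.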
Generally, most methods won't panic; they return an error instead, and using defer adds a bit of overhead.
So yes, it does work, but the "proper" / "go" way is to return an error instead of using panic / recover.
