defer file close on overridden variable - go

I'm in the process of learning Go, so I'm trying to write an app that gets some data from a JSON API and puts it into a file. I wrote a function to check whether my file exists and, if not, to create it.
func ensureFileExists(filePath string) {
	f, err := os.Open(storageFile)
	defer func() {
		err := f.Close()
		if err != nil {
			fmt.Printf("fail to close file %q, error: %q", filePath, err)
			return
		}
		fmt.Printf("file %q closed", filePath)
	}()
	if err != nil {
		if os.IsNotExist(err) {
			f, err = os.Create(storageFile)
			if err != nil {
				panic(err)
			}
			return
		}
		panic(err)
	}
}
My considerations are:
Is this a correct way of doing it?
Is it a bad thing to call defer f.Close() when the file was not opened (by Open, Create, ...)? Should I call it only after the error check?
Related to the previous question: assuming that the file from the os.Open call does not need to be closed because an error was returned, there is no need to assign the results of os.Create to a new variable and close it separately, correct?
What should I do when f.Close() fails? Is there anything more to do than log it or ignore it?

I would say there are multiple "wrongs" in your approach.
First, if a function is defined to return an error along with other values, you should almost always first check for error and only attempt to use the other values if there was no error.
A well-known (but rare) exception are methods of the io.Reader and io.Writer interface which may return a non-zero number of bytes read/written and a non-nil error.
While your deferred call does not use the value assigned to f immediately, if the call to os.Open fails, and hence returns a non-nil error, the value assigned to f is effectively unspecified. Go is not C, and to make it operate on truly uninitialized memory one has to go to great lengths (and use unsafe), but the important fact is that most functions with multiple return values, one of which is an error, do not document what state the remaining values are in when the error is not nil.
In particular, os.Open is free to return any value at all as its first return value when its second one is a non-nil error.
Well, careful programmers usually won't do stupid things, and os.Open actually returns nil as its first return value when its second, error, is not nil.
But think what will happen if your call to os.Open fails: the variable f gets assigned the value nil, and then the deferred function literal, which closes over that variable, will attempt to call Close on the nil value.
Again, some methods defined on pointer receivers know what to do when their receiver is nil. In current Go releases (*os.File).Close happens to be one of them and returns os.ErrInvalid rather than dereferencing the nil pointer, but many methods are not so forgiving and will simply panic, so this is not behaviour to rely on.
Yes, you appear to "compensate" for that with the subsequent call to os.Create, which is not allowed to fail because of the panic, but this merely creates convoluted code. I think you came up with this solution in order not to write two defer blocks, one for a successful os.Open and another for a successful os.Create, but if I were you I'd just write a simple "open or create" helper which would return the same values as os.Open or os.Create do. Believe it or not, Go already has one; read on ;-)
So, the correct usage pattern most of the time is
f, err := os.Open(...)
if err != nil {
	// Handle error
	return ...
}
// At this point f is known to be in a good state
defer func() {
	err := f.Close()
	// ...
}()
Second, there is no need for the try-open-then-create-if-not-exists approach: os.Open and os.Create can be seen as simplified interfaces to the generalized os.OpenFile (which maps quite closely to the open(2) call of POSIX).
With the O_CREATE flag that function will automatically create the file if it does not exist, and as a bonus, that happens atomically with regard to the check (while your approach has a natural race with the filesystem: between the attempt to open the file and the attempt to create it, some other process may create it, making the second call fail).
As to your last question, the answer is "it depends":
If a file was opened for reading and you have successfully read all the (required) data from the file, an error while closing it does not mean you have lost anything and is actually not likely to happen. Logging it as a warning and continuing is OK in most cases.
If a file was opened for writing, a failure to close it may mean that you have lost some part of what was written to that file before the call to Close.
A common example of the call to Close failing is a file residing on a networked filesystem (like NFS or CIFS).
Exactly which strategy to employ depends heavily on the nature of the process which performed the operation: say, if you're writing an e-mail server, failure to store a message should result in giving up and properly communicating the problem to the sending client; if you're writing an interactive application, you might ask the user what to do and maybe allow them to retry, or change the file's location and then retry, or whatever.

Related

Idiomatic way to handle logic error vs programming error in golang

I have been using golang to automate some deploy processes and I had to use exec package to call some bash scripts.
I used exec.Command("/home/rodrigo/my-deploy.sh").CombinedOutput() and looked at its implementation:
func (c *Cmd) CombinedOutput() ([]byte, error) {
if c.Stdout != nil {
return nil, errors.New("exec: Stdout already set")
}
if c.Stderr != nil {
return nil, errors.New("exec: Stderr already set")
}
var b bytes.Buffer
c.Stdout = &b
c.Stderr = &b
err := c.Run()
return b.Bytes(), err
}
I realized you can't assign c.Stdout when using CombinedOutput(), and I think that's OK, but the way this is reported to the API caller does not seem right.
CombinedOutput() returns an error when you use it the wrong way: if you are going to call CombinedOutput(), you must not assign c.Stderr or c.Stdout beforehand, and if you do, you receive an error.
But this error does not mean that your script failed; it means you are using the API incorrectly. In that case I believe you should get a panic, because bad API usage should not be handled (I think).
I come from Java World and when you are using some method in the wrong way then you receive an RuntimeException, for example.
public void run(Job job) throws NotCompletedJob {
if (job.getId() != null) {
throw new IllegalArgumentException("This job should not have id");
}
job.setId(calculateId());
job.run();
}
With this signature I know it is wrong to call run(job) with a Job that already has an id, and I can tell whether the error comes from my script or from using the API incorrectly.
NotCompletedJob is a checked exception, so I must handle it, but IllegalArgumentException is not, so I could get it at any time. Catching IllegalArgumentException or any other RuntimeException is not always considered good practice, because such exceptions indicate a programmer's mistake rather than an expected failure like NotCompletedJob.
Having said that, how can I differentiate a programming error (bad API usage, for example) from an expected error (the script did not finish OK) with the current CombinedOutput() implementation?
To clarify my concern: I'm not saying that the current implementation of CombinedOutput is wrong, but I don't understand how the caller can distinguish an error of the command being executed from an error caused by their own bad API usage.
I believe a better approach would be to panic when the caller uses the API in the wrong way, just as happens when the caller passes a nil reference to a function that expects a non-nil one (in fact this is the current behaviour).
I come from Java World and when you are using some method in the wrong
way then you receive an RuntimeException.
You are in the Go world now. Therefore, that argument is invalid. Abandon Java.
The Go Programming Language Specification
Handling panics
Two built-in functions, panic and recover, assist in reporting and
handling run-time panics and program-defined error conditions.
func panic(interface{})
func recover() interface{}
While executing a function F, an explicit call to panic or a run-time
panic terminates the execution of F. Any functions deferred by F are
then executed as usual. Next, any deferred functions run by F's caller
are run, and so on up to any deferred by the top-level function in the
executing goroutine. At that point, the program is terminated and the
error condition is reported, including the value of the argument to
panic. This termination sequence is called panicking.
The Go Blog
Defer, Panic, and Recover
The convention in the Go libraries is that even when a package uses
panic internally, its external API still presents explicit error
return values.
Go Code Review Comments
This page collects common comments made during reviews of Go code, so
that a single detailed explanation can be referred to by shorthands.
This is a laundry list of common mistakes, not a style guide.
Don't Panic
See https://golang.org/doc/effective_go.html#errors. Don't use panic
for normal error handling. Use error and multiple return values.
Effective Go
Errors
The usual way to report an error to a caller is to return an error as
an extra return value.
Your Go server program is concurrently handling 100,000 clients. If an error occurs, report and handle it; always check for errors. DON'T crash all 100,000 clients with a panic. Go packages should not panic.
Read the Go documentation and the Go standard library code.

Should I use panic or return error?

Go provides two ways of handling errors, but I'm not sure which one to use.
Assuming I'm implementing a classic ForEach function which accepts a slice or a map as an argument. To check whether an iterable is passed in, I could do:
func ForEach(iterable interface{}, f interface{}) {
if isNotIterable(iterable) {
panic("Should pass in a slice or map!")
}
}
or
func ForEach(iterable interface{}, f interface{}) error {
if isNotIterable(iterable) {
return fmt.Errorf("Should pass in a slice or map!")
}
}
I saw some discussions saying panic() should be avoided, but people also say that if program cannot recover from error, you should panic().
Which one should I use? And what's the main principle for picking the right one?
You should assume that a panic will be immediately fatal, for the entire program, or at the very least for the current goroutine. Ask yourself "when this happens, should the application immediately crash?" If yes, use a panic; otherwise, use an error.
Use panic.
Because your use case is to catch a bad use of your API. This should never happen at runtime if the program is calling your API properly.
In fact, any program calling your API with correct arguments will behave in the same way if the test is removed. The test is there only to fail early with an error message helpful to the programmer who made the mistake. Ideally, the panic would be reached once during development when running the test suite, and the programmer would fix the call even before committing the bad code, so that the incorrect use would never reach production.
See also this response to the question "Is function parameter validation using errors a good pattern in Go?".
I like the way it's done in some libraries where on top of a regular method DoSomething, its "panicky" version is added with MustDoSomething. I'm relatively new to go, but I've already seen it in several places, notably sqlx.
In general, if you want to expose your code to someone else, you should either have Must- and a regular version of the method, or your methods/functions should give the client a chance to recover the way they want and so error should be available to them in a go-idiomatic way.
Having said that, I agree that if your API/library is used inappropriately, it's Ok to panic as well. As a matter of fact, I've also seen methods like MustGetenv() that will panic if a critical env.var is missing. Fail-fast mechanism basically.
If some mandatory requirement is not provided or not there while starting the service (eg. database connection, some service configuration which is required) then you should use panic.
There should be return error for any user response or server side error.
Ask yourself these questions:
Do you expect the exceptional situation to occur regardless of how well you code your app? Do you think it would be useful to make the user aware of the condition as part of the normal usage of your app? Handle it as an error, because it is part of the application working normally.
Should that exceptional situation NOT occur if you code appropriately (and somewhat defensively)? (example: dividing by zero, or accessing an array element out of bounds) Is your app totally clueless under that error? Panic.
Do you have your API and want to ensure users use it appropriately? Panic. Your API will seldom recover if used incorrectly.
Use error whenever possible
Only use panic when your code could end up in a bad state that would be prone to crashing; something truly unexpected. The example above with ForEach() is an exported func that accepts an interface so it should expect someone will improperly call it. And if it is improperly called, you know why you cannot continue and you know how to handle that error. isNotIterable is literally binary and easy to control.
But error is not like a try/catch
Even if you try to justify panic/recover by looking at throw/catch from other languages, you still use errors. We know you are trying the function because you are calling it, we know there was an error because err != nil, and just like checking the type of exception thrown you can check the type of error returned with errors.Is(err, ErrNotIterable)
So should you use panic for errors in concurrency?
The answer is still most likely no. Errors are still the preferred way in Go, and you can use an errgroup (from golang.org/x/sync/errgroup) together with a context to shut down the goroutines:
ctx, cancel := context.WithTimeout(context.Background(), time.Minute*5)
// automatically cancel in 5 min
defer cancel()
errGroup, ctx := errgroup.WithContext(ctx)
errGroup.Go(func() error {
// do crazy stuff; you can still check for errors
if ... {
return fmt.Errorf("critical error, stopping all goroutines now")
}
// code completed without issues
return nil
})
err = errGroup.Wait()
Even using the structure of the original example, you still have better control with errors than panics:
func ForEach(iterable interface{}, f interface{}) error {
	if isNotIterable(iterable) {
		return fmt.Errorf("expected something iterable but got %v", reflect.ValueOf(iterable).String())
	}
	// v was used but never declared in the original snippet.
	v := reflect.ValueOf(iterable)
	switch v.Kind() {
	case reflect.Map:
		...
	case reflect.Array, reflect.Slice:
		...
	default:
		return fmt.Errorf("isNotIterable is false but I do not know how to iterate through %v", v.String())
	}
	return nil
}
But error feels very verbose
Yes, that is the point. When an error is returned, it is at that point to do something about it. You are giving the calling code options rather than making the decision to start shutting down and killing the application unless you recover(). If you are just returning the same error all the way up the call stack then error will seem inferior to panic, but this is due to not addressing issues when they happen.
So when to use panic?
When your code is on a collision course to crash and you cannot assume your way out of it. Another is when the code assumes something that is no longer true and having to check the integrity in every function from here on out would be tedious (and might impact performance). Still, you would use panic() only to get out of the layers of uncertainty... then still handle errors:
// err must be a named result so the deferred function can set it.
func ForEach(iterable interface{}, f interface{}) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("cannot iterate due to unexpected runtime error %v", r)
			return
		}
	}()
	...
	// perhaps a broken pipe in a global var
	// or an included module threw a panic at you!
}
But if you are still not convinced... Here is the Go FAQ
We believe that coupling exceptions to a control structure, as in the try-catch-finally idiom, results in convoluted code. It also tends to encourage programmers to label too many ordinary errors, such as failing to open a file, as exceptional.
Go takes a different approach. For plain error handling, Go's multi-value returns make it easy to report an error without overloading the return value. A canonical error type, coupled with Go's other features, makes error handling pleasant but quite different from that in other languages.
A panic typically means something went unexpectedly wrong. Mostly used to fail fast on errors that shouldn’t occur during normal operation, or that we aren’t prepared to handle gracefully. So in this case just return the error, you don't want your program to panic.
I think none of the previous answers are correct:
By default, if we don't know what to do with the "error", the code must panic, following best programming patterns:
https://en.wikipedia.org/wiki/Fail-fast
Putting it more formally, our "Turing Machine" is broken and we need to come back to a "stable state" or "reset state". More info at https://en.wikipedia.org/wiki/Reset_(computing)
For example in web (micro)services that means returning a 40X error (panic caused by input from user) or 50X error (panic caused by something else - hardware, network, assert error, ...)
If we know what to do with the "error", then we do not have an error in first place, but an uncomfortable return value. This is a normal execution condition and probably not an error. Normally this correspond to the happy vs non-happy path modeling.
In summary, the err return value is mostly a wrong idea, even if the Go community has adopted it as a religion. Using error return values is just a patchy way to speed up program execution, since it requires fewer CPU instructions to implement, but most of the time, except for low-level services, it is useless and promotes dirty code. (Note that Go was designed to implement those low-level services as an "easy C", but it was adopted for high-level (Layer 7) application programs, where an error must fail fast to avoid continuing in undefined states that can potentially cause lost money or fatal casualties.) In case of doubt, default to panic.
Don't use panic for normal error handling. Use error and multiple return values. See https://golang.org/doc/effective_go.html#errors.

How to ensure concurrency in Golang gorilla WebSocket package

I have studied the Godoc of the gorilla/websocket package.
In the Godoc it is clearly stated that
Concurrency
Connections support one concurrent reader and one concurrent writer.
Applications are responsible for ensuring that no more than one goroutine calls the write methods (NextWriter, SetWriteDeadline, WriteMessage, WriteJSON, EnableWriteCompression, SetCompressionLevel) concurrently and that no more than one goroutine calls the read methods (NextReader, SetReadDeadline, ReadMessage, ReadJSON, SetPongHandler, SetPingHandler) concurrently.
The Close and WriteControl methods can be called concurrently with all other
methods.
However, in one of the example provided by the package
func (c *Conn) readPump() {
defer func() {
hub.unregister <- c
c.ws.Close()
}()
c.ws.SetReadLimit(maxMessageSize)
c.ws.SetReadDeadline(time.Now().Add(pongWait))
c.ws.SetPongHandler(func(string) error {
c.ws.SetReadDeadline(time.Now().Add(pongWait)); return nil
})
for {
_, message, err := c.ws.ReadMessage()
if err != nil {
if websocket.IsUnexpectedCloseError(err, websocket.CloseGoingAway) {
log.Printf("error: %v", err)
}
break
}
message = bytes.TrimSpace(bytes.Replace(message, newline, space, -1))
hub.broadcast <- message
}
}
Source: https://github.com/gorilla/websocket/blob/a68708917c6a4f06314ab4e52493cc61359c9d42/examples/chat/conn.go#L50
This line
c.ws.SetPongHandler(func(string) error {
c.ws.SetReadDeadline(time.Now().Add(pongWait)); return nil
})
and this line
_, message, err := c.ws.ReadMessage()
seem not to be synchronized, because the first is a callback function, so it should be invoked in a goroutine created inside the package, while the second executes in the goroutine that invokes serveWs.
More importantly, how should I ensure that no more than one goroutine calls SetReadDeadline, ReadMessage, SetPongHandler and SetPingHandler concurrently?
I tried to use a Mutex, locking it whenever I call the above functions and unlocking it afterwards, but I quickly realized there is a problem. It is usual (also in the example) for ReadMessage to be called in a for-loop. But if the Mutex is locked before ReadMessage, then no other read function can acquire the lock and execute until the next message is received.
Is there any better way in handling this concurrency issue? Thanks in advance.
The best way to ensure that there are no concurrent calls to the read methods is to execute all of the read methods from a single goroutine.
All of the Gorilla websocket examples use this approach including the example pasted in the question. In the example, all calls to the read methods are from the readPump method. The readPump method is called once for a connection on a single goroutine. It follows that the connection read methods are not called concurrently.
The section of the documentation on control messages says that the application must read the connection to process control messages. Based on this and Gorilla's own examples, I think it's safe to assume that the ping, pong and close handlers will be called from the application's reading goroutine as it is in the current implementation. It would be nice if the documentation could be more explicit about this. Maybe file an issue?

Why does Go panic() take interface{} instead of ...interface{} as argument?

I noticed panic takes interface{} as an argument, while fmt.Print and the likes take ...interface{}. Wouldn't it be more convenient if panic took ...interface{} as well?
Why did the Go authors define panic as func panic(v interface{}) rather than func panic(v ...interface{}) (like they did with fmt)?
panic didn't take just one argument at first.
You can trace the one-argument implementation back to 30th March 2010:
commit 01eaf78 gc: add panic and recover (still unimplemented in runtime)
main semantic change is to enforce single argument to panic.
The spec was fixed in commit 5bb29fb, and commit 00f9f0c illustrates how panic could previously take multiple arguments:
src/pkg/bufio/bufio.go
b, err := NewWriterSize(wr, defaultBufSize)
if err != nil {
// cannot happen - defaultBufSize is valid size
- panic("bufio: NewWriter: ", err.String())
+ panic(err)
}
That follows a proposal from March 25th, 2010:
We don't want to encourage the conflation of errors and exceptions that occur in languages such as Java.
Instead, this proposal uses a slight change to the definition of defer and a couple of runtime functions to provide a clean mechanism for handling truly exceptional conditions.
During panicking, if a deferred function invocation calls recover, recover returns the value passed to panic and stops the panicking.
At any other time, or inside functions called by the deferred call, recover returns nil.
After stopping the panicking, a deferred call can panic with a new argument, or the same one, to continue panicking.
Alternately, the deferred call might edit the return values for its outer function, perhaps returning an error.
In those various scenarios, dealing with one value to pass around seems easier than dealing with a variable number of arguments (especially when it comes to implementing recover in C).
Because the value you pass to panic is a value you want to panic with and can be retrieved using recover. Having multiple panic values doesn't really make sense.
package main

import "fmt"

func main() {
	defer func() {
		if v := recover(); v != nil {
			fmt.Println(v.(int))
		}
	}()
	panic(3)
}
Example on using recover values:
For a real-world example of panic and recover, see the json package
from the Go standard library. It decodes JSON-encoded data with a set
of recursive functions. When malformed JSON is encountered, the parser
calls panic to unwind the stack to the top-level function call, which
recovers from the panic and returns an appropriate error value (see
the 'error' and 'unmarshal' methods of the decodeState type in
decode.go).
Source: http://blog.golang.org/defer-panic-and-recover

is it wrong to treat panic / recover as throw / catch

Speaking as a new go enthusiast trying to work with the go way of error handling. To be clear - I like exceptions.
I have a server that accepts a connection, processes a set of requests and replies to them. I found that I can do
if err != nil{
panic(err)
}
in the deep down processing code
and have
defer func() {
if err := recover(); err != nil {
log.Printf("%s: %s", err, debug.Stack()) // line 20
}
}()
in the client connection code (each connection is in a goroutine). This nicely wraps everything up, forcefully closes the connection (other defers fire) and my server continues to hum along.
But this feels an awful lot like a throw/catch scenario - which golang states it doesn't support. Questions
Is this stable? That is, is recovering from a panic an OK thing to do as an ongoing way of life, rather than something intended only to slightly defer an immediate shutdown?
I looked for a discussion on this topic and did not find it anywhere - any pointers?
I feel that the answer is 'yes, it works' and that it can be used inside your own code, but panic should NOT be used by a library intended for wider use. The standard and polite way for a library to behave is to return errors.
Yes, you can do what you suggest. There are some situations within the standard packages where panic/recover is used for handling errors. The official Go blog states:
For a real-world example of panic and recover, see the json package
from the Go standard library. It decodes JSON-encoded data with a set
of recursive functions. When malformed JSON is encountered, the parser
calls panic to unwind the stack to the top-level function call, which
recovers from the panic and returns an appropriate error value (see
the 'error' and 'unmarshal' methods of the decodeState type in
decode.go).
Some pointers:
Use error for your normal use cases. This should be your default.
If your code would get clearer and simpler by using a panic/recover (such as with a recursive call stack), then use it for that particular case.
Never let a package leak panics. Panics used within a package should be recovered within the package and returned as an error.
Recovering from a panic is stable. Don't worry about continuing execution after a recover. You can see such behavior in the standard library, for example in the net/http package, which recovers from panics within handlers to prevent the entire HTTP server from crashing when a single request panics.
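The "never let a package leak panics" pointer can be sketched as follows, in the spirit of the json package behaviour quoted above. Everything here (parseError, sum, Sum) is a hypothetical illustration, not code from the standard library: the internal helper panics with a private wrapper type, and the exported function recovers only that type and turns it back into an error.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseError marks panics that this code raises on purpose.
type parseError struct{ err error }

// sum adds decimal numbers. Deep inside, it panics instead of threading an
// error through every level.
func sum(fields []string) int {
	total := 0
	for _, s := range fields {
		n, err := strconv.Atoi(s)
		if err != nil {
			panic(parseError{err})
		}
		total += n
	}
	return total
}

// Sum is the exported entry point. It converts the package's own panics into
// an error and lets any other panic continue.
func Sum(fields []string) (total int, err error) {
	defer func() {
		if r := recover(); r != nil {
			pe, ok := r.(parseError)
			if !ok {
				panic(r) // not ours: a genuine bug, keep panicking
			}
			total, err = 0, pe.err
		}
	}()
	return sum(fields), nil
}

func main() {
	fmt.Println(Sum([]string{"1", "2", "3"}))
	fmt.Println(Sum([]string{"1", "oops"}))
}
```

Re-panicking on unknown values keeps real programming errors, such as an out-of-range index, from being disguised as ordinary failures.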
Generally most methods won't panic, they will return an error instead, and there's a bit of an overhead of using defer.
So yes, it does work, but the "proper" / "go" way is to return an error instead of using panic / recover.

Resources