Having used C for decades I got into the habit of using the zero value of an enum as a special undefined/unknown/error value. Over the years I believe this has saved me not hours or even days but months of debugging time since it makes it obvious when a value has not been initialized. (I wouldn't do this for simple enums where there is a sensible default value and no possibility of uninitialized values.)
It seems to me that this practice is even more useful in Go as values are automatically zero-initialised for you. However, I have been told that "idiomatic" Go zero-values should be valid values. I think this "rule" was invented for structs, where it makes a lot of sense (in the absence of constructors) to have a newly created "zeroed" struct ready for use, but there are cases where there is no logical default value (for structs and enums).
If you need it here is an example:
type Base int
const (
Invalid Base = iota
A
C
T
G
)
Note that I have searched extensively for this question on SO and was surprised that this specific topic has not been covered. I realise that my question is somewhat subjective and may be flagged but I think it is useful. I am looking for evidence that using zero values to indicate error conditions is acceptable Go practice. Any examples of this use, eg. from the standard Go library, would be appreciated.
A true enum type should only be assigned a value from a list of pre-defined constant values. The go language, however, does not have such a type-value enforcement.
go has const which typically uses a derivative type of say int. There is no compile/run-time mechanism to enforce a value is strictly within a pre-defined list.
So what does this mean in practice?
Is your enum value mandatory or optional? That is, when deserializing the 'enum' value, is it:
optional - then use the zero-value signifies the default value
mandatory - then the zero-value indicates an initialization error
Depending on your common use-case, choose one of these two options.
EDIT:
Deserializing is not the only concern. One has to be careful when branching on enum values. For example:
type role int
const (
user role = iota
helpdesk
admin
)
func greet(r role) {
switch r {
case admin:
fmt.Println("hi admin")
case helpdesk:
fmt.Println("hi helpdesk")
default:
fmt.Println("hi user") // right?
}
}
This works:
var r role
r = admin
greet(r) // hi admin
But what about this?
r = 12
greet(r) // 'hi user' ?!!
So be sure to pedantically validate on valid values only:
func validateRole(r role) (err error) {
switch r {
case user, helpdesk, admin: // all valid values
default:
err = fmt.Errorf("invalid `role` enum %d", r)
}
return
}
Playground
Related
As mentioned in Go specification:
"A type determines a set of values together with operations and methods specific to those values."
To introduce an operation or method to be applied on the values of a type,
Is that operation applied on values (taken from a set) supposed to give the result (or value) from the same set?
For example, in the below code, findName() is not supposed to be a method on type user. Instead findName() should be a helper function.
type user struct {
name string
email string
age int
}
func (u user) findElder(other user) user {
if u.age >= other.age {
return u
}
return other
}
func (u user) findName() string {
return u.name
}
"operations and methods specific to those values" does not mean that they are unique to those values, or that they result in those values.
According to Google, "specific" means "clearly defined or identified." In this quote from the Go spec, the word "specific" is used with regard to the fact that Go is strongly typed, meaning that operations and methods work on the types that they are defined or identified to work on.
For example, the == operator is specified to work on integer types, thus, the == operator is specific to values of int, int32, uint8, etc.
No, I don't think that the operation applied on values (taken from a set) are supposed to give the result (or value), only from the same set. They can be from a different set of values as well. It all depends on the use case, the design of the type and the operation.
So in your case, findName() can very well be a method even though it is returning something not in the set of input values.
I'm coming from a Node.js background, and there a typical pattern is to have a function which takes an options object, i.e. an object where you set properties for optional parameters, such as:
foo({
bar: 23,
baz: 42
});
This is JavaScript's "equivalent" to optional and named parameters.
Now I have learnt that there are no optional parameters in Go, except variadic parameters, but they lack the readability of named parameters. So the usual pattern seems to be to hand over a struct.
OTOH a struct can not be defined with default values, so I need a function to set up the struct.
So I end up with:
Call a function that creates the struct and then fills it with default values.
Overwrite the values I would like to change.
Call the function I actually want to call and hand over the struct.
That's quite complicated and lengthy compared to JavaScript's solution.
Is this actually the idiomatic way of dealing with optional and named parameters in Go, or is there a simpler version?
Is there any way that you can take advantage of zero values? All data types get initialized to a zero value, so that is a form of default logic.
An options object is a pretty common idiom. The etcd client library has some examples (SetOptions,GetOptions,DeleteOptions) similar to the following.
type MyOptions struct {
Field1 int // zero value (default) of int is 0
Field2 string // zero value (default) of string is ""
}
func DoAction(arg1, arg2 string, options *MyOptions){
var defaultValue1 int = 30 // some reasonable default
var defaultValue2 string = "west" // some reasonable default
if options != nil {
defaultValue1 = options.Field1 // override with our values
defaultValue2 = options.Field2
}
doStuffWithValues
An relevant question (and very much in the mindset of Go) would be, do you need this kind of complexity? The flexibility is nice, but most things in the standard library try to only deal with 1 default piece of info/logic at a time to avoid this.
This question already has answers here:
Accessing struct fields inside a map value (without copying)
(2 answers)
Closed 7 years ago.
New to Go. Encountered this error and have had no luck finding the cause or the rationale for it:
If I create a struct, I can obviously assign and re-assign the values no problem:
type Person struct {
name string
age int
}
func main() {
x := Person{"Andy Capp", 98}
x.age = 99
fmt.Printf("age: %d\n", x.age)
}
but if the struct is one value in a map:
type Person struct {
name string
age int
}
type People map[string]Person
func main() {
p := make(People)
p["HM"] = Person{"Hank McNamara", 39}
p["HM"].age = p["HM"].age + 1
fmt.Printf("age: %d\n", p["HM"].age)
}
I get cannot assign to p["HM"].age. That's it, no other info. http://play.golang.org/p/VRlSItd4eP
I found a way around this - creating an incrementAge func on Person, which can be called and the result assigned to the map key, eg p["HM"] = p["HM"].incrementAge().
But, my question is, what is the reason for this "cannot assign" error, and why shouldn't I be allowed to assign the struct value directly?
p["HM"] isn't quite a regular addressable value: hashmaps can grow at runtime, and then their values get moved around in memory, and the old locations become outdated. If values in maps were treated as regular addressable values, those internals of the map implementation would get exposed.
So, instead, p["HM"] is a slightly different thing called a "map index expression" in the spec; if you search the spec for the phrase "index expression" you'll see you can do certain things with them, like read them, assign to them, and use them in increment/decrement expressions (for numeric types). But you can't do everything. They could have chosen to implement more special cases than they did, but I'm guessing they didn't just to keep things simple.
Your approach seems good here--you change it to a regular assignment, one of the specifically-allowed operations. Another approach (maybe good for larger structs you want to avoid copying around?) is to make the map value a regular old pointer that you can modify the underlying object through:
package main
import "fmt"
type Person struct {
name string
age int
}
type People map[string]*Person
func main() {
p := make(People)
p["HM"] = &Person{"Hank McNamara", 39}
p["HM"].age += 1
fmt.Printf("age: %d\n", p["HM"].age)
}
The left side of the assignment must b "addressable".
https://golang.org/ref/spec#Assignments
Each left-hand side operand must be addressable, a map index expression, or (for = assignments only) the blank identifier.
and https://golang.org/ref/spec#Address_operators
The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array.
as #twotwotwo's comment, p["HM"] is not addressable.
but, there is no such definition show what is "addressable struct operand" in the spec. I think they should add some description for it.
Originally I figured I'd only use pointers for optional struct fields which could potentionally be nil in cases which it was initially built for.
As my code evolved I was writing different layers upon my models - for xml and json (un)marshalling. In these cases even the fields I thought would always be a requirement (Id, Name etc) actually turned out to be optional for some layers.
In the end I had put a * in front of all the fields including so int became *int, string became *string etc.
Now I'm wondering if I had been better of not generalising my code so much? I could have duplicated the code instead, which I find rather ugly - but perhaps more efficient than using pointers for all struct fields?
So my question is whether this is turning into an anti-pattern and just a bad habbit, or if this added flexibility does not come at a cost from a performance point of view?
Eg. can you come up with good arguments for sticking with option A:
type MyStruct struct {
Id int
Name string
ParentId *int
// etc.. only pointers where NULL columns in db might occur
}
over this option B:
type MyStruct struct {
Id *int
Name *string
ParentId *int
// etc... using *pointers for all fields
}
Would the best practice way of modelling your structs be from a purely database/column perspective, or eg if you had:
func (m *MyStruct) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
var v struct {
XMLName xml.Name `xml:"myStruct"`
Name string `xml:"name"`
Parent string `xml:"parent"`
Children []*MyStruct `xml:"children,omitempty"`
}
err := d.DecodeElement(&v, &start)
if err != nil {
return err
}
m.Id = nil // adding to db from xml, there's initially no Id, until after the insert
m.Name = v.Name // a parent might be referenced by name or alias
m.ParentId = nil // not by parentId, since it's not created yet, but maybe by nesting elements like you see above in the V struct (Children []*ContentType)
// etc..
return nil
}
This example could be part of the scenario where you want to add elements from XML to the database. Here ids would generally not make sense, so instead we use nesting and references on name or other aliases. An Id for the structs would not be set until we got the id, after the INSERT query. Then using that ID we could traverse down the hierachy to the child elements etc.
This would allow us to have just 1 MyStruct, and use eg. different POST http request handler functions, depending if the call came from form input, or xml importing where a nested hierarchy and different relations might need come different handling.
In the end I guess what I'm asking is:
Would you be better off separating struct models for db, xml- and json operations (or whatever scenario that you can think of), than using struct field pointers all the way, so we can reuse the model for different, yet related stuff?
Apart from possible performance (more pointers = more things for the GC to scan), safety (nil pointer dereference), convenience (s.a = 2 vs s.a = new(int); *s.a = 42), and memory penalties (a bool is one byte, a *bool is four to eight), there is one thing that really bothers me in the all-pointer approach. It violates the Single responsibility principle.
Is the MyStruct you get from XML or DB same as MyStruct? What if the DB schema will change? What if the XML changes format? What if you'll also need to unmarshal it into JSON, but in a slightly different manner? And what if you need to support all that (and in multiple versions!) at the same time?
A lot of pain comes to you when you try to make one thing do many things. Is having one do-it-all type instead of N specialised types really worth it?
Whats the correct way in go to distinguish between when a value in a struct was never set, or is just empty, for example, given the following:
type Organisation struct {
Category string
Code string
Name string
}
I need to know (for example) if the category was never set, or was saved as blank by the user, should I be doing this:
type Organisation struct {
Category *string
Code *string
Name *string
}
I also need to ensure I correctly persist either null or an empty string to the database
I'm still learning GO so it is entirely possible my question needs more info.
The zero value for a string is an empty string, and you can't distinguish between the two.
If you are using the database/sql package, and need to distinguish between NULL and empty strings, consider using the sql.NullString type. It is a simple struct that keeps track of the NULL state:
type NullString struct {
String string
Valid bool // Valid is true if String is not NULL
}
You can scan into this type and use it as a query parameter, and the package will handle the NULL state for you.
Google's protocol buffers (https://code.google.com/p/goprotobuf/) use pointers to describe optional fields.
The generated objects provide GetFoo methods which take the pain away from testing for nil (a.GetFoo() returns an empty string if a.Foo is nil, otherwise it returns *a.Foo).
It introduces a nuisance when you want to write literal structs (in tests, for example), because &"something" is not valid syntax to generate a pointer to a string, so you need a helper function (see, for example, the source code of the protocol buffer library for proto.String).
// String is a helper routine that allocates a new string value
// to store v and returns a pointer to it.
func String(v string) *string {
return &v
}
Overall, using pointers to represent optional fields is not without drawbacks, but it's certainly a viable design choice.
The standard database/sql package provides a NullString struct (members are just String string and Valid bool). To take care of some of the repetitive work of persistence, you could look at an object-relational manager like gorp.
I looked into whether there was some way to distinguish two kinds of empty string just out of curiosity, and couldn't find one. With []bytes, []byte{} == []byte(nil) currently returns false, but I'm not sure if the spec guarantees that to always remain true. In any case, it seems like the most practical thing to do is to go with the flow and use NullString.