Prevent missing fields in struct initialization - go

Consider this example. Let's say I have this object which is ubiquitous throughout my codebase:
type Person struct {
Name string
Age int
[some other fields]
}
Somewhere deep in the codebase, I also have some code that creates a new Person struct. Maybe it's something like the following utility function (note that this is just an example of some function that creates a Person-- the point of my question is not to ask about the copy function specifically):
func copyPerson(origPerson Person) *Person {
copy := Person{
Name: origPerson.Name,
Age: origPerson.Age,
[some other fields]
}
return &copy
}
Another developer comes along and adds a new field Gender to the Person struct. However, because the copyPerson function is in a distant piece of code they forget to update copyPerson. Since golang doesn't throw any warning or error if you omit a parameter when creating a struct, the code will compile and appear to work fine; the only difference is that the copyPerson method will now fail to copy over the Gender struct, and the result of copyPerson will have Gender replaced with a nil value (e.g. the empty string).
What is the best way to prevent this from happening? Is there a way to ask golang to enforce no missing parameters in a specific struct initialization? Is there a linter that can detect this type of potential error?

The way I would solve this is to just use NewPerson(params) and not export the person. This makes it so the only way to get a person instance is to go through your New method.
package person
// Struct is not exported
type person struct {
Name string
Age int
Gender bool
}
// We are forced to call the constructor to get an instance of person
func New(name string, age int, gender bool) person {
return person{name, age, gender}
}
This forces everyone to get an instance from the same place. When you add a field, you can add it to the function definition and then you get compile time errors anywhere they are constructing a new instance, so you can easily find them and fix them.

First of all, your copyPerson() function does not live up to its name. It copies some fields of a Person, but not (necessarily) all. It should've been named copySomeFieldsOfPerson().
To copy a complete struct value, just assign the struct value. If you have a function receiving a non-pointer Person, that is already a copy, so just return its address:
func copyPerson(p Person) *Person {
return &p
}
That's all, this will copy all present and future fields of Person.
Now there may be cases where fields are pointers or header-like values (like a slice) which should be "detached" from the original field (more precisely from the pointed object), in which case you do need to make manual adjustments, e.g.
type Person struct {
Name string
Age int
Data []byte
}
func copyPerson(p Person) *Person {
p2 := p
p2.Data = append(p2.Data, p.Data...)
return &p2
}
Or an alternative solution which does not make another copy of p but still detaches Person.Data:
func copyPerson(p Person) *Person {
var data []byte
p.Data = append(data, p.Data...)
return &p
}
Of course, if someone adds a field which also needs manual handling, this won't help you out.
You could also use unkeyed literal, like this:
func copyPerson(p Person) *Person {
return &Person{
p.Name,
p.Age,
}
}
This will result in a compile-time error if someone adds a new field to Person, because an unkeyed composite struct literal must list all fields. Again, this will not help you out if someone changes the fields where the new fields are assignable to the old ones (e.g. someone swaps 2 fields next to each other having the same type), also unkeyed literals are discouraged.
Best would be for the package owner to provide a copy constructor, next to the Person type definition. So if someone changes Person, he / she should be responsible keeping CopyPerson() still operational. And as others mentioned, you should already have unit tests which should fail if CopyPerson() does not live up to its name.
The best viable option?
If you can't place the CopyPerson() next to the Person type and have its author maintain it, go ahead with the struct value copying and manual handling of pointer and header-like fields.
And you can create a person2 type which is a "snapshot" of the Person type. Use a blank global variable to receive compile-time alert if the original Person type changes, in which case copyPerson()'s containing source file will refuse to compile, so you'll know it needs adjusting.
This is how it can be done:
type person2 struct {
Name string
Age int
}
var _ = Person(person2{})
The blank variable declaration will not compile if fields of Person and person2 do not match.
A variation of the above compile-time check could be to use typed-nil pointers:
var _ = (*Person)((*person2)(nil))

I'm not aware of a language rule that enforces that.
But you can write custom checkers for Go vet if you'd like. Here's a recent post talking about that.
That said, I would reconsider the design here. If the Person struct is so important in your code base, centralize its creation and copying so that "distant places" don't just create and move Persons around. Refactor your code so that only a single constructor is used to build Persons (maybe something like person.New returning a person.Person), and then you'll be able to centrally control how its fields are initialized.

The idiomatic way would be to not do this at all, and instead make the zero value useful. The example of a copy function doesn't really make sense because it's totally unnecessary - you could just say:
copy := new(Person)
*copy = *origPerson
and not need a dedicated function nor have to keep a listing of fields up to date. If you want a constructor for new instances like NewPerson, just write one and use it as a matter of course. Linters are great for some things but nothing beats well-understood best practices and peer code review.

The best solution I have been able to come up with (and it's not very good) is to define a new struct tempPerson identical to the Person struct and put it nearby to any code which initializes a new Person struct, and to change the code that initializes a Person so that it instead initializes it as a tempPerson but then casts it to a Person. Like this:
type tempPerson struct {
Name string
Age int
[some other fields]
}
func copyPerson(origPerson Person) *Person {
tempCopy := tempPerson{
Name: orig.Name,
Age: orig.Age,
[some other fields]
}
copy := (Person)(tempCopy)
return &copy
}
This way if another field Gender is added to Person but not to tempPerson the code will fail at compile-time. Presumably the developer would then see the error, edit tempPerson to match their change to Person, and in doing so notice the nearby code which uses tempPerson and recognize that they should edit that code to also handle the Gender field as well.
I don't love this solution because it involves copying and pasting the struct definition everywhere that we initialize a Person struct and would like to have this safety. Is there any better way?

Approach 1 Add something like copy constructor:
type Person struct {
Name string
Age int
}
func CopyPerson(name string, age int)(*Person, error){
// check params passed if needed
return &Person{Name: name, Age: age}, nil
}
p := CopyPerson(p1.Name, p1.age) // force all fields to be passed
Approach 2: (not sure if this is possible)
Can this be covered in tests say using reflection?
If we compare the number of fields initialised(initialise all the field with values different than the default values) in the original struct and the fields in copy returned by the copy function.

Here is how i would do it:
func copyPerson(origPerson Person) *Person {
newPerson := origPerson
//proof that 'newPerson' points to a new person object
newPerson.name = "new name"
return &newPerson
}
Go Playground

Related

Polymorphism on structs without methods in Go

I'm working on several web server projects in Go, and there is a common problem that I'm always facing. I know we can achieve something like polymorphism in Go with interfaces and methods, but many times I had a scenario that I needed polymorphism on some data-holder structs that (maybe) just had some common fields, and no methods at all.
For example consider a story writing platform, where each user can write short stories and novels:
type ShortStory struct {
Name string
ID int
Body string
}
type LongStory struct {
Name string
ID int
Chapters []string
}
Now I simply want to have a data layer function, say GetStories(), which fetches all stories written by a user from database.
func GetStories(id int) []SOME_TYPE {
...
}
There are really no methods that I want to have on my ShortStory and LongStory structs. I know I can add a dummy method and let them satisfy some Storier interface, then use that interface as return type. But since there is no method I would want on a data container model, adding a dummy method just for the language to enable a feature, seems like a poor design choice to me.
I can also make the function return []interface{}, but that's against the whole idea of "typed language" I believe.
Another way is to have two separate GetShortStories() and GetLongStories() methods, which return a slice of their own type. But at some point I would finally want to merge those two slices into one and there I would again need a []interface{}. Yes, I can return a JSON like:
{
"short_stories" : [...],
"long_stories" : [...]
}
But I want my json to be like:
[{...}, {...}, {...}]
And I wouldn't change my APIs because of a language's limits!
I'm not a pro in Go, so am I missing something here? Is there a Go-ish approach to this, or is it really bad language design on Golang's side?
If you cannot express what you want to do using the features of a language, you should first try to change the way you structure your program before blaming the language itself. There are concepts that cannot be expressed in Go but can be expressed well in other languages, and there are concepts you cannot express well in other languages but you can in Go. Change the way you solve the problem to effectively use the language.
One way you can address your problem is using a different type of struct:
type Story struct {
Name string
ID int
ShortBody string
Chapters []string
}
If the Chapters is empty, then it is a short story.
Another way:
type Story struct {
Name string
ID int
Content StoryContent
}
type StoryContent interface {
Type() string
}
type ShortStory interface {
StoryContent
Body() string
}
type LongStory interface {
StoryContent
Chapters() []string
}
etc.

Copy common fields between structs of different types

I have two structs, whose types are as follows:
type UserStruct struct {
UserID string `bson:"user_id" json:"user_id"`
Address string `bson:"address" json:"address"`
Email string `bson:"email" json:"email"`
CreatedAt time.Time `bson:"created_at" json:"created_at"`
PhoneNumber string `bson:"phone_number" json:"phone_number"`
PanCard string `bson:"pancard" json:"pancard"`
Details map[string]string `json:"details"`
}
type SecretsStruct struct {
UserID string `r:"user_id" json:"user_id"`
Secrets []string `r:"secrets" json:secrets`
Address string `r:"address" json:"address"`
Email string `r:"email"json:"email"`
CreatedAt time.Time `r:"created_at"json:"created_at"`
PhoneNumber string `r:"phone_number" json:"phone_number"`
PanCard string `r:"pancard" json:"pancard"`
}
I already have an instance of UserStruct. I want to copy the fields common to both structs from UserStruct to a new instance of SecretStruct, without using reflection.
Go is a statically typed language (and is not Python). If you want to copy fields between the structs, you must either cause code to be supplied at compile time which knows how to do this, or use the reflect library to perform the operation at runtime.
Note that I said "cause code to be supplied at compile time" because you don't have to explicitly write that code. You could use code generation to produce the copy code from the struct definitions, or from a higher-level definition (e.g. XML) which generates both the struct definition and the copying code.
However, good Go programmers prefer clear code over clever solutions. If this is a single localized requirement, writing a code generator to avoid "boilerplate" code is almost certainly overkill; its implementation will take longer than the code to copy the structs, and the associated complexity will introduce a risk of more bugs. Similarly, reflect-based solutions are complicated, not clear, and only recommended in cases where you require a generic or extensible solution, and where this cannot be fulfilled at compile time.
I recommend simply write the copying code, and add appropriate comments to the struct definitions and copy methods to ensure future maintainers are aware of their obligation to maintain the copy methods.
Example
// Define your types - bodies elided for brevity
// NOTE TO MAINTAINERS: if editing the fields in these structs, ensure
// the methods defined in source file <filename>.go are updated to
// ensure common fields are copied between structs on instantiation.
type UserStruct struct { ... }
type SecretStruct struct { ... }
// NewSecretStructFromUserStruct populates and returns a SecretStruct
// from the elements common to the two types. This method must be
// updated if the set of fields common to both structs is changed in
// future.
func NewSecretStructFromUserStruct(us *UserStruct) *SecretStruct {
// You should take care to deep copy where necessary,
// e.g. for any maps shared between the structs (not
// currently the case).
ss := new(SecretStruct)
ss.UserID = us.UserID
ss.Address = us.Address
ss.Email = us.Email
ss.CreatedAt = us.CreatedAt
ss.PhoneNumber = us.PhoneNumber
ss.PanCard = us.PanCard
return ss
}
// You may also consider this function to be better suited as
// a receiver method on UserStruct.

Access all fields from parent method

I'm developing an application where data is stored in mongodb. There are several collections and of course all of them have some common fields (like Id, creation date, etc) and methods (for example Insert). In my vision, I need to create base model struct with needed fields and methods, and then embed this struct into my models. Unfortunately, this doesn't work because method defined for base model doesn't see child fields.
I don't know how to explain further. Here is code in playground:
https://play.golang.org/p/_x-B78g4TV
It uses json instead of mgo, but idea is still the same.
I want the output to be:
Saving to 'my_model_collection'
{"_id":42, "foo": "Some value for foo", "bar": "Here we set some value for bar"}
Not:
Saving to 'my_model_collection'
{"_id":42}
Writing that insert method for each my model seems to be against DRY, so what is correct/idiomatic way to achieve this in Go?
This is not possible, for details see my answer: Can embedded struct method have knowledge of parent/child?
You may do 2 things:
1. Abandon method and make it a helper / utility function
The idea is to make Insert() detached from BaseModel and make it a simple function, and you pass the document to it which you want to save.
I personally prefer this option, as it requires less hassle and maintenance. It could look like this:
func Insert(doc interface{}) {
j, _ := json.Marshal(doc)
fmt.Println(string(j))
}
You also had a "typo" in the tags:
type MyModel struct {
*BaseModel
Foo string `json:"foo"`
Bar string `json:"bar"`
}
Using it:
Insert(m)
Output (try it on the Go Playground):
{"_id":42,"foo":"Some value for foo","bar":"Here we set some value for bar"}
2. Pass the (pointer to) the wrapper to the BaseModel
In this approach, you have to pass a pointer to the embedder struct so the BaseModel.Insert() method will have a pointer to it, and may use that to save / marshal. This is basically manually maintaining a "reference" to the struct that embeds us and is being saved/marshalled.
This is how it could look like:
type BaseModel struct {
Id int `json:"_id"`
collectionName string
wrapper interface{}
}
And then in the Insert() method save the wrapper:
func (m *BaseModel) Insert() {
fmt.Printf("Saving to '%v'\n", m.collectionName)
j, _ := json.Marshal(m.wrapper)
fmt.Println(string(j))
}
Creation is slightly more complex:
func NewMyModel() *MyModel {
mm := &MyModel{
Foo: "Some value for foo",
}
mm.BaseModel = NewBaseModel("my_model_collection", mm)
return mm
}
But output is as you wish:
Saving to 'my_model_collection'
{"_id":42,"foo":"Some value for foo","bar":"Here we set some value for bar"}
Try it on the Go Playground.
In Golang, you can't override a parent method, because that's not how polymorphism works. The Insert method will apply on the BaseModel member, and not on MyModel.
Also, you're trying to use mgo in an improper way. If you want to insert documents in collections, then you already have an Insert method for a Collection struct which works on interface{} types (same as json.Marshal).
Of course, you can have a BaseModel that will contain fields shared by all of your models. In fact, GORM uses a similar approach and provides a Model struct to be included in every child model.
Well known problem ;o) Member variables (like collectionName) which name starts with lower letter are not visible from other packages (like json). Therefore change struct to:
type BaseModel struct {
Id int `json:"_id"`
CollectionName string `json:"collectionName"`
}
and world will be better place to live in.

Any down-side always using pointers for struct field types?

Originally I figured I'd only use pointers for optional struct fields which could potentionally be nil in cases which it was initially built for.
As my code evolved I was writing different layers upon my models - for xml and json (un)marshalling. In these cases even the fields I thought would always be a requirement (Id, Name etc) actually turned out to be optional for some layers.
In the end I had put a * in front of all the fields including so int became *int, string became *string etc.
Now I'm wondering if I had been better of not generalising my code so much? I could have duplicated the code instead, which I find rather ugly - but perhaps more efficient than using pointers for all struct fields?
So my question is whether this is turning into an anti-pattern and just a bad habbit, or if this added flexibility does not come at a cost from a performance point of view?
Eg. can you come up with good arguments for sticking with option A:
type MyStruct struct {
Id int
Name string
ParentId *int
// etc.. only pointers where NULL columns in db might occur
}
over this option B:
type MyStruct struct {
Id *int
Name *string
ParentId *int
// etc... using *pointers for all fields
}
Would the best practice way of modelling your structs be from a purely database/column perspective, or eg if you had:
func (m *MyStruct) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
var v struct {
XMLName xml.Name `xml:"myStruct"`
Name string `xml:"name"`
Parent string `xml:"parent"`
Children []*MyStruct `xml:"children,omitempty"`
}
err := d.DecodeElement(&v, &start)
if err != nil {
return err
}
m.Id = nil // adding to db from xml, there's initially no Id, until after the insert
m.Name = v.Name // a parent might be referenced by name or alias
m.ParentId = nil // not by parentId, since it's not created yet, but maybe by nesting elements like you see above in the V struct (Children []*ContentType)
// etc..
return nil
}
This example could be part of the scenario where you want to add elements from XML to the database. Here ids would generally not make sense, so instead we use nesting and references on name or other aliases. An Id for the structs would not be set until we got the id, after the INSERT query. Then using that ID we could traverse down the hierachy to the child elements etc.
This would allow us to have just 1 MyStruct, and use eg. different POST http request handler functions, depending if the call came from form input, or xml importing where a nested hierarchy and different relations might need come different handling.
In the end I guess what I'm asking is:
Would you be better off separating struct models for db, xml- and json operations (or whatever scenario that you can think of), than using struct field pointers all the way, so we can reuse the model for different, yet related stuff?
Apart from possible performance (more pointers = more things for the GC to scan), safety (nil pointer dereference), convenience (s.a = 2 vs s.a = new(int); *s.a = 42), and memory penalties (a bool is one byte, a *bool is four to eight), there is one thing that really bothers me in the all-pointer approach. It violates the Single responsibility principle.
Is the MyStruct you get from XML or DB same as MyStruct? What if the DB schema will change? What if the XML changes format? What if you'll also need to unmarshal it into JSON, but in a slightly different manner? And what if you need to support all that (and in multiple versions!) at the same time?
A lot of pain comes to you when you try to make one thing do many things. Is having one do-it-all type instead of N specialised types really worth it?

Properly distinguish between not set (nil) and blank/empty value

Whats the correct way in go to distinguish between when a value in a struct was never set, or is just empty, for example, given the following:
type Organisation struct {
Category string
Code string
Name string
}
I need to know (for example) if the category was never set, or was saved as blank by the user, should I be doing this:
type Organisation struct {
Category *string
Code *string
Name *string
}
I also need to ensure I correctly persist either null or an empty string to the database
I'm still learning GO so it is entirely possible my question needs more info.
The zero value for a string is an empty string, and you can't distinguish between the two.
If you are using the database/sql package, and need to distinguish between NULL and empty strings, consider using the sql.NullString type. It is a simple struct that keeps track of the NULL state:
type NullString struct {
String string
Valid bool // Valid is true if String is not NULL
}
You can scan into this type and use it as a query parameter, and the package will handle the NULL state for you.
Google's protocol buffers (https://code.google.com/p/goprotobuf/) use pointers to describe optional fields.
The generated objects provide GetFoo methods which take the pain away from testing for nil (a.GetFoo() returns an empty string if a.Foo is nil, otherwise it returns *a.Foo).
It introduces a nuisance when you want to write literal structs (in tests, for example), because &"something" is not valid syntax to generate a pointer to a string, so you need a helper function (see, for example, the source code of the protocol buffer library for proto.String).
// String is a helper routine that allocates a new string value
// to store v and returns a pointer to it.
func String(v string) *string {
return &v
}
Overall, using pointers to represent optional fields is not without drawbacks, but it's certainly a viable design choice.
The standard database/sql package provides a NullString struct (members are just String string and Valid bool). To take care of some of the repetitive work of persistence, you could look at an object-relational manager like gorp.
I looked into whether there was some way to distinguish two kinds of empty string just out of curiosity, and couldn't find one. With []bytes, []byte{} == []byte(nil) currently returns false, but I'm not sure if the spec guarantees that to always remain true. In any case, it seems like the most practical thing to do is to go with the flow and use NullString.

Resources