What are the use(s) for struct tags in Go? - go

In the Go Language Specification, it mentions a brief overview of tags:
A field declaration may be followed by an optional string literal tag,
which becomes an attribute for all the fields in the corresponding
field declaration. The tags are made visible through a reflection
interface but are otherwise ignored.
// A struct corresponding to the TimeStamp protocol buffer.
// The tag strings define the protocol buffer field numbers.
struct {
	microsec  uint64 "field 1"
	serverIP6 uint64 "field 2"
	process   string "field 3"
}
This is a very short explanation IMO, and I was wondering if anyone could provide me with what use these tags would be?

A tag for a field allows you to attach meta-information to the field which can be acquired using reflection. Usually it is used to provide transformation info on how a struct field is encoded to or decoded from another format (or stored/retrieved from a database), but you can use it to store whatever meta-info you want to, either intended for another package or for your own use.
As mentioned in the documentation of reflect.StructTag, by convention the value of a tag string is a space-separated list of key:"value" pairs, for example:
type User struct {
Name string `json:"name" xml:"name"`
}
The key usually denotes the package that the subsequent "value" is for, for example json keys are processed/used by the encoding/json package.
If multiple pieces of information are to be passed in the "value", they are usually separated by commas (','), e.g.
Name string `json:"name,omitempty" xml:"name"`
Usually a dash value ('-') for the "value" means to exclude the field from the process (e.g. in case of json it means not to marshal or unmarshal that field).
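As a minimal sketch of the two options just described, the following shows `omitempty` and `-` with encoding/json. The `Account` type and the `encode` helper are hypothetical names made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Account is a hypothetical type used only for this illustration.
type Account struct {
	Name     string `json:"name"`
	Nickname string `json:"nickname,omitempty"` // dropped from the output when empty
	Password string `json:"-"`                  // never marshaled or unmarshaled
}

// encode marshals an Account and returns the JSON text.
func encode(a Account) string {
	b, err := json.Marshal(a)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(Account{Name: "Bob", Password: "secret"}))
	// {"name":"Bob"}
	fmt.Println(encode(Account{Name: "Bob", Nickname: "bobby", Password: "secret"}))
	// {"name":"Bob","nickname":"bobby"}
}
```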
Example of accessing your custom tags using reflection
We can use reflection (reflect package) to access the tag values of struct fields. Basically we need to acquire the Type of our struct, and then we can query fields e.g. with Type.Field(i int) or Type.FieldByName(name string). These methods return a value of StructField which describes / represents a struct field; and StructField.Tag is a value of type StructTag which describes / represents a tag value.
Previously we talked about "convention". This convention means that if you follow it, you may use the StructTag.Get(key string) method which parses the value of a tag and returns you the "value" of the key you specify. The convention is implemented / built into this Get() method. If you don't follow the convention, Get() will not be able to parse key:"value" pairs and find what you're looking for. That's also not a problem, but then you need to implement your own parsing logic.
Also there is StructTag.Lookup() (was added in Go 1.7) which is "like Get() but distinguishes the tag not containing the given key from the tag associating an empty string with the given key".
So let's see a simple example:
type User struct {
	Name  string `mytag:"MyName"`
	Email string `mytag:"MyEmail"`
}

u := User{"Bob", "bob@mycompany.com"}
t := reflect.TypeOf(u)
for _, fieldName := range []string{"Name", "Email"} {
	field, found := t.FieldByName(fieldName)
	if !found {
		continue
	}
	fmt.Printf("\nField: User.%s\n", fieldName)
	fmt.Printf("\tWhole tag value : %q\n", field.Tag)
	fmt.Printf("\tValue of 'mytag': %q\n", field.Tag.Get("mytag"))
}
Output (try it on the Go Playground):
Field: User.Name
	Whole tag value : "mytag:\"MyName\""
	Value of 'mytag': "MyName"

Field: User.Email
	Whole tag value : "mytag:\"MyEmail\""
	Value of 'mytag': "MyEmail"
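The difference between Get() and Lookup() mentioned earlier can be sketched as follows. The `Item` type and the `describe` helper are hypothetical and exist only for this illustration; `mytag` is the same arbitrary key used above.

```go
package main

import (
	"fmt"
	"reflect"
)

// Item is a hypothetical type: A carries an empty "mytag" value,
// B has no "mytag" key at all.
type Item struct {
	A string `mytag:""`
	B string `other:"x"`
}

// describe reports what Get and Lookup return for the "mytag" key of a field.
func describe(fieldName string) string {
	f, ok := reflect.TypeOf(Item{}).FieldByName(fieldName)
	if !ok {
		return "no such field"
	}
	v, present := f.Tag.Lookup("mytag")
	return fmt.Sprintf("Get=%q Lookup=(%q, %t)", f.Tag.Get("mytag"), v, present)
}

func main() {
	fmt.Println("A:", describe("A")) // A: Get="" Lookup=("", true)
	fmt.Println("B:", describe("B")) // B: Get="" Lookup=("", false)
}
```

Get() returns an empty string in both cases, so only Lookup() can tell an empty value apart from a missing key.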
GopherCon 2015 had a presentation about struct tags called:
The Many Faces of Struct Tags (slide) (and a video)
Here is a list of commonly used tag keys:
json      - used by the encoding/json package, detailed at json.Marshal()
xml       - used by the encoding/xml package, detailed at xml.Marshal()
bson      - used by gobson, detailed at bson.Marshal(); also by the mongo-go driver, detailed at bson package doc
protobuf  - used by github.com/golang/protobuf/proto, detailed in the package doc
yaml      - used by the gopkg.in/yaml.v2 package, detailed at yaml.Marshal()
db        - used by the github.com/jmoiron/sqlx package; also used by github.com/go-gorp/gorp package
orm       - used by the github.com/astaxie/beego/orm package, detailed at Models – Beego ORM
gorm      - used by gorm.io/gorm, examples can be found in their docs
valid     - used by the github.com/asaskevich/govalidator package, examples can be found in the project page
datastore - used by appengine/datastore (Google App Engine platform, Datastore service), detailed at Properties
schema    - used by github.com/gorilla/schema to fill a struct with HTML form values, detailed in the package doc
asn1      - used by the encoding/asn1 package, detailed at asn1.Marshal() and asn1.Unmarshal()
csv       - used by the github.com/gocarina/gocsv package
env       - used by the github.com/caarlos0/env package

Here is a really simple example of tags being used with the encoding/json package to control how fields are interpreted during encoding and decoding:
Try live: http://play.golang.org/p/BMeR8p1cKf
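package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Person maps its fields to snake_case JSON keys via struct tags.
type Person struct {
	FirstName  string `json:"first_name"`
	LastName   string `json:"last_name"`
	MiddleName string `json:"middle_name,omitempty"` // left out of the output when empty
}

func main() {
	jsonString := `
{
	"first_name": "John",
	"last_name": "Smith"
}`

	// Decode: the tags tell encoding/json which keys fill which fields.
	person := new(Person)
	if err := json.Unmarshal([]byte(jsonString), person); err != nil {
		log.Fatal(err)
	}
	fmt.Println(person)

	// Encode: MiddleName is empty, so omitempty drops it.
	newJSON, err := json.Marshal(person)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", newJSON)
}

// *Output*
// &{John Smith }
// {"first_name":"John","last_name":"Smith"}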
The json package can look at the tags for the field and be told how to map json <=> struct field, and also extra options like whether it should ignore empty fields when serializing back to json.
Basically, any package can use reflection on the fields to look at tag values and act on those values. There is a little more info about them in the reflect package
http://golang.org/pkg/reflect/#StructTag :
By convention, tag strings are a concatenation of optionally
space-separated key:"value" pairs. Each key is a non-empty string
consisting of non-control characters other than space (U+0020 ' '),
quote (U+0022 '"'), and colon (U+003A ':'). Each value is quoted using
U+0022 '"' characters and Go string literal syntax.
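To illustrate the point that any package can read tags through reflection and act on them, here is a minimal sketch of a tag-driven check. The `check` key, its `required` value, the `Signup` type and the `missing` function are all invented for this example and belong to no real package.

```go
package main

import (
	"fmt"
	"reflect"
)

// Signup is a hypothetical type; the "check" tag key is made up for this sketch.
type Signup struct {
	Email string `check:"required"`
	Note  string
}

// missing returns the names of string fields tagged check:"required"
// whose value is empty. It expects a struct value, not a pointer.
func missing(v interface{}) []string {
	val := reflect.ValueOf(v)
	typ := val.Type()
	if typ.Kind() != reflect.Struct {
		return nil
	}
	var names []string
	for i := 0; i < typ.NumField(); i++ {
		f := typ.Field(i)
		if f.Tag.Get("check") != "required" || f.Type.Kind() != reflect.String {
			continue
		}
		if val.Field(i).String() == "" {
			names = append(names, f.Name)
		}
	}
	return names
}

func main() {
	fmt.Println(missing(Signup{}))                   // [Email]
	fmt.Println(missing(Signup{Email: "bob@x.org"})) // []
}
```

Real validation packages are considerably more elaborate, but the basic loop (iterate fields, read the tag, act on it) has this shape.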

Struct tags are a kind of specification that tells packages how to treat a tagged field.
For example:
type User struct {
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
}
The json tag informs the json package that the marshalled output of the following user
u := User{
	FirstName: "some first name",
	LastName:  "some last name",
}
would be like this:
{"first_name":"some first name","last_name":"some last name"}
Another example is the gorm package, whose tags declare how database migrations must be done:
type User struct {
	gorm.Model
	Name         string
	Age          sql.NullInt64
	Birthday     *time.Time
	Email        string  `gorm:"type:varchar(100);unique_index"`
	Role         string  `gorm:"size:255"`        // set field size to 255
	MemberNumber *string `gorm:"unique;not null"` // set member number to unique and not null
	Num          int     `gorm:"AUTO_INCREMENT"`  // set num to auto incrementable
	Address      string  `gorm:"index:addr"`      // create index with name `addr` for address
	IgnoreMe     int     `gorm:"-"`               // ignore this field
}
In this example, the gorm tag on the Email field declares that the corresponding database column must be of type varchar with a maximum length of 100, and that it must have a unique index.
Another example is the binding tag, which is used mostly in the gin package.
type Login struct {
	User     string `form:"user" json:"user" xml:"user" binding:"required"`
	Password string `form:"password" json:"password" xml:"password" binding:"required"`
}

var json Login
if err := c.ShouldBindJSON(&json); err != nil {
	c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
	return
}
The binding tag in this example tells the gin package that the data sent to the API must contain the user and password fields, because these fields are tagged as required.
So generally, tags are data that packages need in order to know how to treat the fields of different struct types, and the best way to get familiar with the tags a package expects is to read that package's documentation completely.

Related

How to handle NaN values when writing to parquet in GO?

I am trying to write to a parquet file in Go. While writing to this file, I can get NaN values. Since NaN is defined neither in the primitive types nor in the logical types, how do I handle this value in Go? Does any existing schema work for it?
I am using the parquet Go library from here. You can find an example of code that uses a JSON schema to write to parquet with this library here.
The issue was discussed at length in xitongsys/parquet-go issue 281, with the recommendation being to
use OPTIONAL type.
Even if you don't assign a value (as in your code), the non-pointer value will be assigned a default value.
So parquet-go doesn't know whether it's null or the default value.
However:
What it comes down to is that I cannot use the OPTIONAL type; in other words, I cannot convert my structure to use pointers.
I have tried to use repetitiontype=OPTIONAL as a tag, but this leads to some weird behavior.
I would expect that tag to behave the same way as the omitempty tag in the Go standard library, i.e. if the value is not present then it is not put into the JSON.
The reason this is important is that if the field is missing or not set, when it is encoded to parquet then there is no way of telling if the value was 0 or just not set in the case of int64.
This illustrates the issue:
package main

import (
	"encoding/json"
	"io/ioutil"
)

type Salary struct {
	Basic, HRA, TA float64 `json:",omitempty"`
}

type Employee struct {
	FirstName, LastName, Email string `json:",omitempty"`
	Age                        int
	MonthlySalary              []Salary `json:",omitempty"`
}

func main() {
	data := Employee{
		Email: "mark@gmail.com",
		MonthlySalary: []Salary{
			{
				Basic: 15000.00,
			},
		},
	}
	file, _ := json.MarshalIndent(data, "", " ")
	_ = ioutil.WriteFile("test.json", file, 0o644)
}
with a JSON produced as:
{
 "Email": "mark@gmail.com",
 "Age": 0,
 "MonthlySalary": [
  {
   "Basic": 15000
  }
 ]
}
As you can see, the items in the struct that have the omitempty tag and are not assigned do not appear in the JSON, i.e. HRA and TA.
But on the other hand, Age does not have this tag and hence is still included in the JSON.
This is problematic as all fields in the struct are assigned memory when this Go library writes to parquet, so a big struct that is only sparsely populated will still take the full amount of memory.
It is a bigger problem when the file is read again, as there is no way of knowing whether the value put in the parquet file was the empty value or was just not assigned.
I am happy to help implement an omitempty tag for this library if I can convince you of the value of having it.
That echoes issue 403 "No option to omitempty when not using pointers".
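The ambiguity described above (a stored 0 versus a value that was never set) can be sketched with encoding/json alone. This is only an analogy under that assumption: it says nothing about how parquet-go itself encodes fields, and the `Reading` type and `encodeReading` helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Reading is a hypothetical record used only for this illustration.
type Reading struct {
	Plain    int64  `json:"plain"`              // 0 and "never set" look identical
	Optional *int64 `json:"optional,omitempty"` // nil means "never set"
}

// encodeReading marshals a Reading and returns the JSON text.
func encodeReading(r Reading) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	zero := int64(0)
	fmt.Println(encodeReading(Reading{}))                // {"plain":0}
	fmt.Println(encodeReading(Reading{Optional: &zero})) // {"plain":0,"optional":0}
}
```

With the pointer field, an explicit 0 survives encoding while an unset value is omitted; with the plain field, the two cases cannot be told apart.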

How to use mapstructure tag as json tag?

I'm using a third-party package that has this struct with mapstructure tags.
I want to marshal an instance of this struct to JSON using the names specified in the mapstructure tags. What should I do?
I could add json tags, but that would mean modifying the package files, which I think is a bad approach.
type ServiceConfig struct {
	// name of the service
	Name string `mapstructure:"name"`
	// set of endpoint definitions
	Endpoints string `mapstructure:"end_points"`
	// default timeout
	Timeout time.Duration `mapstructure:"timeout"`
}
I want to get:
{"name":"sss", "end_points" :"xxx", "timeout" : "120"}
If you do not want to modify the package files, you can create another struct with the same field names, but with JSON tags, and copy:
type JSONServiceConfig struct {
	Name      string        `json:"name"`
	Endpoints string        `json:"end_points"`
	Timeout   time.Duration `json:"timeout"`
}
Then:
x := JSONServiceConfig(serviceConfig)
You cannot do what you want without modifying the mapstructure source, and it would probably get a little bit hairy if you want to specify options, such as json's omitempty. However, you can simply add a second struct tag for this
type ServiceConfig struct {
	// name of the service
	Name string `mapstructure:"name" json:"name"`
	// set of endpoint definitions
	Endpoints string `mapstructure:"end_points" json:"end_points"`
	// default timeout
	Timeout time.Duration `mapstructure:"timeout" json:"timeout"`
}
From the documentation of reflect
By convention, tag strings are a concatenation of optionally
space-separated key:"value" pairs. Each key is a non-empty string
consisting of non-control characters other than space (U+0020 ' '),
quote (U+0022 '"'), and colon (U+003A ':'). Each value is quoted using
U+0022 '"' characters and Go string literal syntax.
Here's a simple example on the playground

Empty or not required struct fields

I have two structs that represent models that will be inserted into a mongodb database. One struct (Investment) has the other struct (Group) as one of its fields.
type Group struct {
	Base
	Name string `json:"name" bson:"name"`
}

type Investment struct {
	Base
	Symbol string `json:"symbol" bson:"symbol" binding:"required"`
	Group  Group  `json:"group" bson:"group"`
	Fields bson.M `json:"fields" bson:"fields"`
}
The problem I'm having is that in the Investment model, Group is not required. If there is no group, I think it's better for it not to be inserted in the db. What's the best way to handle a db model such as this in Go?
tl;dr: Use ,omitempty, and if you need to worry about the difference between a zero value and null/not specified, do what the GitHub API does and use a pointer.
Both json and bson support the ,omitempty tag. For json, "empty values are false, 0, any nil pointer or interface value, and any array, slice, map, or string of length zero" (json docs). For bson, ,omitempty means "Only include the field if it's not set to the zero value for the type or to empty slices or maps", and zero values include empty strings and nil pointers (bson docs).
So if you really need a Group struct, you can put a *Group in instead, and it won't be stored when the pointer is nil. If Investment only needs to hold the group's name, it's even simpler: "" as group name keeps a group key from being stored.
bson defaults to using the lowercased field name already so you can omit that from the struct tag when they match. json will default to the Capitalized name, so specify the lowercase name in a tag if you need lowercase.
So, best case, maybe you can just use:
type Investment struct {
	Base
	Symbol string `json:"symbol" binding:"required"`
	Group  string `json:"group,omitempty" bson:",omitempty"`
	Fields bson.M `json:"fields"`
}
If you ever run into fields where the zero value for the type ("", 0, false, etc.) is distinct from "not specified", you can do what the GitHub API does and put pointers in your structures--essentially an extension of the *Group trick.
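A minimal sketch of the pointer trick follows, restricted to encoding/json (bson is left out, since its handling of empty structs is a separate matter). The `ByValue` and `ByPointer` types and the `encodeAny` helper are hypothetical names for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Group mirrors the nested model from the question, reduced to one field.
type Group struct {
	Name string `json:"name"`
}

// ByValue embeds the group as a struct value.
type ByValue struct {
	Symbol string `json:"symbol"`
	Group  Group  `json:"group,omitempty"` // omitempty has no effect on struct values
}

// ByPointer holds the group behind a pointer.
type ByPointer struct {
	Symbol string `json:"symbol"`
	Group  *Group `json:"group,omitempty"` // a nil pointer is omitted
}

// encodeAny marshals any value and returns the JSON text.
func encodeAny(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encodeAny(ByValue{Symbol: "X"}))   // {"symbol":"X","group":{"name":""}}
	fmt.Println(encodeAny(ByPointer{Symbol: "X"})) // {"symbol":"X"}
}
```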
Avoid marshaling struct fields when they are empty:
A struct field may be of a primitive type (string, int, bool, etc.) or even another struct type.
Sometimes we don't want a struct's field to
go into the JSON data (for a database insertion or an external API call, say) if it is empty.
Example:
type Investment struct {
	Base
	Symbol string `json:"symbol" bson:"symbol" binding:"required"`
	Group  Group  `json:"group" bson:"group"`
	Fields bson.M `json:"fields" bson:"fields"`
}
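If Symbol and Group may hold empty values (0, false, "", a nil pointer or interface, an empty slice or map), we can leave them out when marshaling to JSON as shown below. Note that encoding/json's omitempty never treats a struct value as empty, which is why Group becomes a pointer.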
type Investment struct {
	Base
	Symbol string `json:"symbol,omitempty" bson:"symbol,omitempty" binding:"required"`
	Group  *Group `json:"group,omitempty" bson:"group,omitempty"`
	Fields bson.M `json:"fields" bson:"fields"`
}
Here the Group field is a pointer to a Group struct, and whenever it is nil it will be omitted from JSON marshaling.
And obviously we would fill in the Group field like below.
// investment is a variable of type Investment
investment.Group = &groupData

Properly distinguish between not set (nil) and blank/empty value

What's the correct way in Go to distinguish between a value in a struct that was never set and one that is just empty? For example, given the following:
type Organisation struct {
	Category string
	Code     string
	Name     string
}
I need to know (for example) whether the category was never set or was saved as blank by the user. Should I be doing this?
type Organisation struct {
	Category *string
	Code     *string
	Name     *string
}
I also need to ensure I correctly persist either null or an empty string to the database.
I'm still learning Go, so it is entirely possible my question needs more info.
The zero value for a string is an empty string, and you can't distinguish between the two.
If you are using the database/sql package, and need to distinguish between NULL and empty strings, consider using the sql.NullString type. It is a simple struct that keeps track of the NULL state:
type NullString struct {
	String string
	Valid  bool // Valid is true if String is not NULL
}
You can scan into this type and use it as a query parameter, and the package will handle the NULL state for you.
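As a rough sketch of how sql.NullString keeps NULL and the empty string apart, the following calls its Scan method directly with sample values. Normally database/sql invokes Scan for you through Rows.Scan; calling it by hand here is only so the example runs without a database. The `describe` helper is a hypothetical name.

```go
package main

import (
	"database/sql"
	"fmt"
)

// describe scans a raw column value into a sql.NullString and reports
// whether the result is NULL or an actual (possibly empty) string.
func describe(src interface{}) string {
	var ns sql.NullString
	if err := ns.Scan(src); err != nil {
		return "error: " + err.Error()
	}
	if !ns.Valid {
		return "NULL"
	}
	return fmt.Sprintf("%q", ns.String)
}

func main() {
	fmt.Println(describe(nil))       // NULL
	fmt.Println(describe(""))        // ""
	fmt.Println(describe("charity")) // "charity"
}
```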
Google's protocol buffers (https://code.google.com/p/goprotobuf/) use pointers to describe optional fields.
The generated objects provide GetFoo methods which take the pain away from testing for nil (a.GetFoo() returns an empty string if a.Foo is nil, otherwise it returns *a.Foo).
It introduces a nuisance when you want to write literal structs (in tests, for example), because &"something" is not valid syntax to generate a pointer to a string, so you need a helper function (see, for example, the source code of the protocol buffer library for proto.String).
// String is a helper routine that allocates a new string value
// to store v and returns a pointer to it.
func String(v string) *string {
	return &v
}
Overall, using pointers to represent optional fields is not without drawbacks, but it's certainly a viable design choice.
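The pointer approach, combined with a getter in the style described above, might look like the sketch below. The `GetCategory` method and the `state` helper are hypothetical names written for this illustration, not generated protocol buffer code.

```go
package main

import "fmt"

// Organisation uses a pointer so that nil can mean "never set".
type Organisation struct {
	Category *string
}

// String allocates a new string value and returns a pointer to it,
// mirroring the helper quoted above.
func String(v string) *string {
	return &v
}

// GetCategory returns "" when Category is nil, otherwise the pointed-to value.
func (o Organisation) GetCategory() string {
	if o.Category == nil {
		return ""
	}
	return *o.Category
}

// state tells the three cases apart: never set, set to blank, set to a value.
func state(o Organisation) string {
	switch {
	case o.Category == nil:
		return "not set"
	case *o.Category == "":
		return "set to blank"
	default:
		return "set to " + *o.Category
	}
}

func main() {
	fmt.Println(state(Organisation{}))                            // not set
	fmt.Println(state(Organisation{Category: String("")}))        // set to blank
	fmt.Println(state(Organisation{Category: String("charity")})) // set to charity
}
```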
The standard database/sql package provides a NullString struct (members are just String string and Valid bool). To take care of some of the repetitive work of persistence, you could look at an object-relational manager like gorp.
I looked into whether there was some way to distinguish two kinds of empty string just out of curiosity, and couldn't find one. With []byte there is a difference: []byte{} == nil is false while []byte(nil) == nil is true (slices can only be compared against nil, not against each other), but I'm not sure I would rely on that. In any case, it seems like the most practical thing to do is to go with the flow and use NullString.

Strange type definition syntax in Golang (name, then type, then string literal)

I've been trying to find out how to use mgo (MongoDB driver for Go) and I came across this struct declaration:
type Something struct {
	Id   bson.ObjectId "_id,omitempty"
	Name string
}
I don't quite understand the syntax of the first element (Id). I understand that it's being declared as type bson.ObjectId, but what is the string literal doing there?
My question is not about the mgo driver functionality,
but about this strange <name> <type> <string_literal> syntax.
I couldn't find anything in the Go spec, and I don't know how to google this either.
It's explained in the Struct types section of the language specification:
A field declaration may be followed by an optional string literal
tag, which becomes an attribute for all the fields in the corresponding field declaration. The tags are made visible through a
reflection interface but are otherwise ignored.
// A struct corresponding to the TimeStamp protocol buffer.
// The tag strings define the protocol buffer field numbers.
struct {
	microsec  uint64 "field 1"
	serverIP6 uint64 "field 2"
	process   string "field 3"
}
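Tags like the ones in the spec example do not follow the key:"value" convention, so StructTag.Get cannot parse them, but the raw string is still available through reflection. A minimal sketch, where the `TimeStamp` type name and the `rawTag` helper are assumptions made for this illustration:

```go
package main

import (
	"fmt"
	"reflect"
)

// TimeStamp reuses the unconventional tags from the spec example.
type TimeStamp struct {
	microsec  uint64 "field 1"
	serverIP6 uint64 "field 2"
	process   string "field 3"
}

// rawTag returns the whole tag string of a field and what Get finds
// for the key "field" (nothing, since the tag is not in key:"value" form).
func rawTag(name string) (raw, viaGet string) {
	f, ok := reflect.TypeOf(TimeStamp{}).FieldByName(name)
	if !ok {
		return "", ""
	}
	return string(f.Tag), f.Tag.Get("field")
}

func main() {
	raw, viaGet := rawTag("microsec")
	fmt.Printf("raw=%q get=%q\n", raw, viaGet) // raw="field 1" get=""
}
```

A package that accepts such tags has to interpret the raw string with its own parsing logic.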

Resources