Validate yaml schema with golang (semantic check) - validation

We have a tool which needs to read a YAML file with a specific structure. When we get the YAML file, we need to:
Check whether the YAML file is valid according to some guideline (a semantic check)
Find where the syntax error is, if any
For example, this is the kind of validation that we need to address:
_version: {required: true}
id: {required: true, pattern: '/^[A-Za_\-\.]+$/'}
release-version: {required: true}
type:
builds:
type:seq
sequence:
-type:map
mapping:
name:{required: true, unique: true, pattern: '/^[A-Za-z0-3_\-\.]+$/'}
params:
type: map
mapping: { =: {type: any} }
Mapping is a key-value object
seq can have multiple builds
type any is any key-value pair
We use this open-source library to parse the YAML: https://github.com/go-yaml/yaml
One idea (which seems good) is to convert the file to JSON and validate it against a JSON schema, since there is a library that supports this: https://github.com/xeipuuv/gojsonschema. Any example in my context would be very helpful.
But I am not sure how to handle:
Type map
Type seq

Here is what you could try.
Model a struct after the shape of the expected yaml data:
type Config struct {
Version struct {
Required bool
}
ID struct {
Required bool
Pattern string
}
ReleaseVersion struct {
Required bool
}
Type interface{}
Builds struct {
Type []interface{} `yaml:"type"`
Sequence struct {
Type string
}
Mapping struct {
Name map[string]interface{}
Params struct {
Type string `yaml:"type"`
Mapping struct {
To map[string]string `yaml:"="`
}
}
} `yaml:"mapping"`
}
}
The struct tag yaml:"somefield" names the YAML key that the field should be populated from.
Also many fields with unknown/undetermined type can be declared as empty interface (interface{}) or if you want to "enforce" that the underlying form is a key-value object you can either declare it as a map[string]interface{} or another struct.
We then unmarshal the yaml data into the struct:
cfg := Config{}
err := yaml.Unmarshal([]byte(data), &cfg)
if err != nil {
log.Fatalf("error: %v", err)
}
Since we have modeled some fields as maps, we can test whether such a field holds a key-value object by comparing it to nil. Note that a struct value can never be nil, so this check only works on the map fields (or on a struct field declared as a pointer).
// Name is declared with map[string]interface{} type,
// so if it's not nil we can say there's a key-value object there.
if cfg.Builds.Mapping.Name != nil {
// Name is a key-value object
}
// type any is any key-value pair
// Mapping.To is declared with map[string]string type
// so if it's not nil we can say there's a map there.
if cfg.Builds.Mapping.Params.Mapping.To != nil {
// Mapping.To is a map
}
In marshaling/unmarshaling, maps and structs are pretty interchangeable. The benefit of a struct is that you can predefine the field names ahead of time, whereas when unmarshaling to a map it won't be clear to you what the keys are.
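For the "Type map" and "Type seq" part of the question, one option is to decode into a plain interface{} and check the shape with type assertions. The following is a minimal sketch, assuming the document has already been decoded into string-keyed maps and []interface{} slices. validateBuilds and namePattern are hypothetical names, and the pattern is an assumed variant of the one in the schema.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is an assumed variant of the name pattern from the schema.
var namePattern = regexp.MustCompile(`^[A-Za-z0-9_.-]+$`)

// validateBuilds checks that v is a sequence of mappings and that every
// mapping has a unique name matching namePattern.
func validateBuilds(v interface{}) error {
	seq, ok := v.([]interface{}) // corresponds to "type: seq"
	if !ok {
		return fmt.Errorf("builds: expected a sequence, got %T", v)
	}
	seen := make(map[string]bool)
	for i, item := range seq {
		m, ok := item.(map[string]interface{}) // corresponds to "type: map"
		if !ok {
			return fmt.Errorf("builds[%d]: expected a mapping, got %T", i, item)
		}
		name, ok := m["name"].(string)
		if !ok {
			return fmt.Errorf("builds[%d].name: required string is missing", i)
		}
		if !namePattern.MatchString(name) {
			return fmt.Errorf("builds[%d].name: %q does not match the pattern", i, name)
		}
		if seen[name] {
			return fmt.Errorf("builds[%d].name: %q is not unique", i, name)
		}
		seen[name] = true
		// params is optional, but must be a mapping when present.
		if p, present := m["params"]; present {
			if _, ok := p.(map[string]interface{}); !ok {
				return fmt.Errorf("builds[%d].params: expected a mapping, got %T", i, p)
			}
		}
	}
	return nil
}

func main() {
	builds := []interface{}{
		map[string]interface{}{"name": "linux", "params": map[string]interface{}{"arch": "amd64"}},
		map[string]interface{}{"name": "linux"},
	}
	fmt.Println(validateBuilds(builds)) // reports the duplicate name
}
```

Hand-written checks like this grow quickly as the schema grows, which is the trade-off against a schema library.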

You can make go-yaml work with jsonschema. See this issue: https://github.com/santhosh-tekuri/jsonschema/issues/5
In short:
Create a custom yaml parser that produces compatible output types, as per this issue.
Parse the yaml into an interface{} using that custom parser
Validate with jsonschema.ValidateInterface.
(once yaml.v3 has been released, the custom unmarshaller should be able to be replaced with a configuration option)
I was originally using the accepted answer's approach of parsing into a struct and then writing code to manually validate that the struct met my spec. This quickly got very ugly - the above approach allows for a clean separate spec and reliable validation of it.
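The "compatible output types" step can also be done after parsing. yaml.v2 decodes nested mappings into map[interface{}]interface{}, which JSON tooling does not accept, so a small recursive conversion is needed. This is a sketch of that conversion only, not the parser from the linked issue; normalize and toJSON are hypothetical helper names, and the input in main is hand-built to mimic what the YAML decoder would return.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalize rewrites map[interface{}]interface{} values into
// map[string]interface{} recursively, so the tree is JSON-compatible.
func normalize(v interface{}) interface{} {
	switch t := v.(type) {
	case map[interface{}]interface{}:
		out := make(map[string]interface{}, len(t))
		for k, val := range t {
			out[fmt.Sprint(k)] = normalize(val) // non-string keys are stringified
		}
		return out
	case map[string]interface{}:
		out := make(map[string]interface{}, len(t))
		for k, val := range t {
			out[k] = normalize(val)
		}
		return out
	case []interface{}:
		out := make([]interface{}, len(t))
		for i, val := range t {
			out[i] = normalize(val)
		}
		return out
	default:
		return v // scalars are left untouched
	}
}

// toJSON normalizes the decoded tree and encodes it as JSON.
func toJSON(v interface{}) (string, error) {
	b, err := json.Marshal(normalize(v))
	return string(b), err
}

func main() {
	// Hand-built stand-in for the result of decoding a YAML document.
	decoded := map[interface{}]interface{}{
		"id":     "app",
		"builds": []interface{}{map[interface{}]interface{}{"name": "b1"}},
	}
	out, err := toJSON(decoded)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The normalized value, or its JSON encoding, can then be handed to whichever JSON schema validator is in use.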

Related

Dynamic db models in golang

I have a yaml file of a certain configuration which is read by the go program to build a struct object.
The struct itself looks like this
type YamlConfig struct {
Attributes map[string]struct {
Label string `yaml:"label"`
Type string `yaml:"type"`
Presence bool `yaml:"presence"`
Uniqueness bool `yaml:"uniqueness"`
Strip bool `yaml:"strip"`
Values []string `yaml:"values"`
Default string `yaml:"default"`
Multi bool `yaml:"multi"`
Searchable bool `yaml:"searchable"`
Pattern struct {
Value string `yaml:"value"`
Message string `yaml:"message"`
} `yaml:"pattern"`
Length struct {
Min int `yaml:"min"`
Max int `yaml:"max"`
} `yaml:"length"`
} `yaml:"attributes"`
}
I have that map of Attributes, whose keys can be anything from "name" to "whatever"; they should represent DB table columns with their types.
My question is - can I somehow take that object, which is quite dynamic and may not include the data for all the attributes' properties and convert it somehow into a usable ORM model with Gorm or something?
Should I always define a model struct or is it possible to build structs dynamically?
I think you are already using yaml.Unmarshal() for parsing your YAML config, right?
Instead of unmarshaling into a fully defined struct, you can unmarshal into a map of empty interfaces:
var config map[string]interface{}
if err := yaml.Unmarshal(configFile, &config); err != nil {
log.Fatal(err)
}
fmt.Println(config["attributes"])
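On the question of building structs dynamically: the standard library's reflect.StructOf can create a struct type at run time. The sketch below assumes a trimmed-down attribute type and a made-up mapping from config type names ("integer", "boolean") to Go types; attribute, goType and buildStruct are hypothetical names. Whether a given ORM accepts such a type is not something this example shows.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"strings"
)

// attribute is a hypothetical, trimmed-down entry of YamlConfig.Attributes.
type attribute struct {
	Type string
}

// goType maps the config's type names to Go types; the names are assumptions.
func goType(name string) reflect.Type {
	switch name {
	case "integer":
		return reflect.TypeOf(int64(0))
	case "boolean":
		return reflect.TypeOf(false)
	default:
		return reflect.TypeOf("")
	}
}

// buildStruct creates a struct type with one exported field per attribute.
func buildStruct(attrs map[string]attribute) reflect.Type {
	names := make([]string, 0, len(attrs))
	for name := range attrs {
		if name != "" {
			names = append(names, name)
		}
	}
	sort.Strings(names) // map iteration order is random; sort for a stable layout

	fields := make([]reflect.StructField, 0, len(names))
	for _, name := range names {
		fields = append(fields, reflect.StructField{
			Name: strings.ToUpper(name[:1]) + name[1:], // fields must be exported
			Type: goType(attrs[name].Type),
			Tag:  reflect.StructTag(fmt.Sprintf(`json:%q`, name)),
		})
	}
	return reflect.StructOf(fields)
}

func main() {
	typ := buildStruct(map[string]attribute{
		"name": {Type: "string"},
		"age":  {Type: "integer"},
	})
	v := reflect.New(typ).Elem()
	v.FieldByName("Name").SetString("gopher")
	v.FieldByName("Age").SetInt(7)
	fmt.Printf("%+v\n", v.Interface())
}
```

Values of such a type can only be read and written through reflection, so a map is often the simpler choice unless a library insists on a struct.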

Marshal struct field to JSON without field name

I need to marshal into this JSON format:
{"messageProtocolHandshake":[{"handshakeType":"announceMax"},{"version":[{"major":1},{"minor":0}]}]}
Problem is matching the handshakeType. My struct is
type MessageProtocolHandshake struct {
HandshakeType HandshakeType `json:"handshakeType"`
Version []Version `json:"version"`
}
type HandshakeType struct {
HandshakeType string
}
Marshaling can be done using slice of interface:
func (h MessageProtocolHandshake) MarshalJSON() ([]byte, error) {
res := make([]interface{}, 2)
res[0] = struct {
HandshakeType string `json:"handshakeType"`
}{h.HandshakeType.HandshakeType}
res[1] = struct {
Version []Version `json:"version"`
}{h.Version}
return json.Marshal(res)
}
Using a simple marshaler/unmarshaler takes away the surrounding curly brackets from the handshakeType, so that doesn't work:
{"messageProtocolHandshake":[{"handshakeType":"announceMax","version":[{"major":1,"minor":0}],"formats":[{"format":"JSON-UTF8"}]}]}
It seems as if Go applies some heuristic in that case to the returned byte array (undocumented?).
Is there a more elegant way of omitting the structs outer field name?
--
UPDATE: To summarise the answers: the key is to think about different structs for marshalling and unmarshalling if nothing else works, potentially using a third representation for working internally with the data.
When custom (Un)Marshalers come into play remember that promoted fields inherit their methods and hence influence parent structs.
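The point about promoted methods can be reproduced in a few lines. In this sketch, Inner, Outer and encode are hypothetical names: because Inner is embedded, its MarshalJSON method is promoted to Outer, so encoding/json treats Outer itself as a json.Marshaler and never looks at its other fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Inner struct {
	A int `json:"a"`
}

// MarshalJSON is a custom marshaler defined on the embedded type only.
func (Inner) MarshalJSON() ([]byte, error) {
	return []byte(`"inner"`), nil
}

type Outer struct {
	Inner     // embedding promotes MarshalJSON to Outer
	B     int `json:"b"`
}

// encode marshals v and returns the result as a string.
func encode(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	out, err := encode(Outer{Inner{1}, 2})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // field B does not appear in the output
}
```

Giving Outer its own MarshalJSON, or using a named field instead of embedding, avoids the surprise.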
The JSON that you specified has a different model from that of your struct.
There are a few approaches to aligning these: Change the specification of the JSON data to match your structs, change the structs to match the specification of the JSON, or create a new struct that is only used for marshaling.
I omit the last example, because it's very similar to the second method.
Changing the specification of the JSON
The following model stays the same:
type MessageProtocolHandshake struct {
HandshakeType HandshakeType `json:"handshakeType"`
Version []Version `json:"version"`
}
type HandshakeType struct {
HandshakeType string
}
The JSON for this would be:
{"handshakeType":{"HandshakeType":""},"version":[]}
You did not specify the Version type so I don't know how one would change the JSON for that.
Changing the structs
The following JSON stays the same:
{"messageProtocolHandshake":[{"handshakeType":"announceMax"},{"version":[{"major":1},{"minor":0}]}]}
The structs for this would be:
type Model struct {
MessageProtocolHandshake []interface{} `json:"messageProtocolHandshake"`
}
type HandshakeType struct {
HandshakeType string `json:"handshakeType"`
}
type Versions struct {
Version []Version `json:"version"`
}
type Version struct {
Major *int `json:"major,omitempty"`
Minor *int `json:"minor,omitempty"`
}
Unmarshaling would not be trivial.
https://play.golang.org/p/89WUhcMFM0B
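Unmarshaling this shape is possible with a custom UnmarshalJSON that walks the array of single-key objects. The sketch below is one way to do it under the JSON shown above; the Handshake type and the decode helper are hypothetical names, not part of the linked playground code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Version struct {
	Major *int `json:"major,omitempty"`
	Minor *int `json:"minor,omitempty"`
}

// Handshake is a flat internal representation of the array-shaped JSON.
type Handshake struct {
	HandshakeType string
	Version       []Version
}

// UnmarshalJSON decodes the array of single-key objects element by element.
func (h *Handshake) UnmarshalJSON(data []byte) error {
	var parts []map[string]json.RawMessage
	if err := json.Unmarshal(data, &parts); err != nil {
		return err
	}
	for _, part := range parts {
		for key, raw := range part {
			var err error
			switch key {
			case "handshakeType":
				err = json.Unmarshal(raw, &h.HandshakeType)
			case "version":
				err = json.Unmarshal(raw, &h.Version)
			}
			if err != nil {
				return fmt.Errorf("field %q: %w", key, err)
			}
		}
	}
	return nil
}

type Model struct {
	MessageProtocolHandshake Handshake `json:"messageProtocolHandshake"`
}

// decode parses the full message into a Model.
func decode(input string) (Model, error) {
	var m Model
	err := json.Unmarshal([]byte(input), &m)
	return m, err
}

func main() {
	m, err := decode(`{"messageProtocolHandshake":[{"handshakeType":"announceMax"},{"version":[{"major":1},{"minor":0}]}]}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	h := m.MessageProtocolHandshake
	fmt.Println(h.HandshakeType, len(h.Version))
}
```

Unknown keys are silently ignored here; a stricter decoder could return an error in a default case.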
As is obvious from the results, the models you are using are not good. If there's a way to change all of this, I would recommend starting from scratch, using the data that is necessary and creating the JSON specification from the structs.
I recommend reading up on JSON: https://www.json.org/json-en.html
Also, I recommend this introduction to Go and JSON: https://blog.golang.org/json

Conditional (Dynamic) Struct Tags

I'm trying to parse some xml documents in Go. I need to define a few structs for this purpose, and my struct tags depend on a certain condition.
Imagine the following code (even though I know it won't work)
if someCondition {
type MyType struct {
// some common fields
Date []string `xml:"value"`
}
} else {
type MyType struct {
// some common fields
Date []string `xml:"anotherValue"`
}
}
var t MyType
// do the unmarshalling ...
The problem is that these two structs have lots of fields in common. The only difference is in one of the fields and I want to prevent duplication. How can I solve this problem?
You use different types to unmarshal. Basically, you write the unmarshaling code twice and either run the first version or the second. There is no dynamic solution to this.
The simplest is probably to handle all possible fields and do some post-processing.
For example:
type MyType struct {
DateField1 []string `xml:"value"`
DateField2 []string `xml:"anotherValue"`
}
// After parsing, you have two options:
// Option 1: re-assign one field onto another:
if !someCondition {
parsed.DateField1 = parsed.DateField2
parsed.DateField2 = nil
}
// Option 2: use the above as an intermediate struct, the final being:
type MyFinalType struct {
Date []string `xml:"value"`
}
if someCondition {
final.Date = parsed.DateField1
} else {
final.Date = parsed.DateField2
}
Note: if the messages are sufficiently different, you probably want completely different types for parsing. The post-processing can generate the final struct from either.
As already indicated, you must duplicate the field. The question is where the duplication should exist.
If it's just a single field of many, one option is to use embedding, and field shadowing:
type MyType struct {
Date []string `xml:"value"`
// many other fields
}
Then when Date uses the other field name:
type MyOtherType struct {
MyType // Embed the original type for all other fields
Date []string `xml:"anotherValue"`
}
Then, after unmarshaling into MyOtherType, it's easy to move the Date value into the original struct:
var data MyOtherType
err := xml.Unmarshal(..., &data)
data.MyType.Date = data.Date
return data.MyType // will be of type MyType, and fully populated
Note that this only works for unmarshaling. If you need to also marshal this data, a similar trick can be used, but the mechanics around it must be essentially reversed.
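Put together, the embedding approach might look like the sketch below with encoding/xml. The Name field, the element names in the sample documents and the parse helper are assumptions made for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type MyType struct {
	Name string   `xml:"name"` // stands in for the many common fields
	Date []string `xml:"value"`
}

type MyOtherType struct {
	MyType            // embed the original type for all other fields
	Date   []string `xml:"anotherValue"`
}

// parse unmarshals doc with the tag set chosen by useOther and always
// returns a MyType.
func parse(doc []byte, useOther bool) (MyType, error) {
	if !useOther {
		var t MyType
		err := xml.Unmarshal(doc, &t)
		return t, err
	}
	var o MyOtherType
	if err := xml.Unmarshal(doc, &o); err != nil {
		return MyType{}, err
	}
	o.MyType.Date = o.Date // move the value into the embedded struct
	return o.MyType, nil
}

func main() {
	doc := []byte(`<doc><name>n</name><anotherValue>2020-01-01</anotherValue></doc>`)
	t, err := parse(doc, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(t.Name, t.Date)
}
```

Callers only ever see MyType, so the condition is confined to the parse function.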

exported field in unexported struct

Example:
type myType struct {
foo []float64
Name string
}
myType is not exported, but Name field in it is exported.
Does this make sense to do this? Is that considered a bad practice?
I have something like this, and it compiles fine.
I can access the Name field if I create an exported array of myType:
var MyArray = []myType{... some initialization }
fmt.Println(MyArray[0].Name) // Name is visible and it compiles
It is perfectly valid to have unexported structs with exported fields. If the type is declared in another package, the declaration var MyArray []myType would be a compile-time error.
While it is perfectly valid to have an exported function with an unexported return type, it is usually annoying to use. The golint tool also gives a warning for such cases:
exported func XXX returns unexported type pname.tname, which can be annoying to use
In such cases it's better to also export the type; or if you can't or don't want to do that, then create an exported interface and the exported function should have a return type of that interface, and so the implementing type may remain unexported. Since interfaces cannot have fields (only methods), this may require you to add some getter methods.
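A minimal sketch of the interface approach, with hypothetical names (Named, NewNamed). Everything lives in package main here so the example runs as one file; in practice the interface, the constructor and the unexported type would sit in a library package.

```go
package main

import "fmt"

// Named is the exported interface that callers program against.
type Named interface {
	Name() string
}

// myType stays unexported; callers never need to spell its name.
type myType struct {
	name string
}

// Name is the getter that replaces direct field access.
func (m myType) Name() string { return m.name }

// NewNamed is exported and returns the interface, not the concrete type.
func NewNamed(name string) Named {
	return myType{name: name}
}

func main() {
	n := NewNamed("gopher")
	fmt.Println(n.Name())
}
```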
Also note that in some cases this is exactly what you want: unexported struct with exported fields. Sometimes you want to pass the struct value to some other package for processing, and in order for the other package to be able to access the fields, they must be exported (but not the struct type itself).
Good example is when you want to generate a JSON response. You may create an unexported struct, and to be able to use the encoding/json package, the fields must be exported. For example:
type response struct {
Success bool `json:"success"`
Message string `json:"message"`
Data string `json:"data"`
}
func myHandler(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json;charset=UTF-8")
resp := &response{
Success: true,
Message: "OK",
Data: "some data",
}
if err := json.NewEncoder(w).Encode(resp); err != nil {
// Handle err
}
}

How do I Unmarshal the bson from mongo of a nested interface with mgo?

I have a collection of documents that contain an array of a custom interface type that I have. Example below. What do I need to do to Unmarshal the bson from mongo so I can eventually return a JSON response?
type Document struct {
Props here....
NestedDocuments customInterface
}
What do I need to do to map the nested interfaces to the right structs?
I think it is evident that an interface cannot be instantiated, therefore the bson runtime does not know which struct has to be used to Unmarshal that object. Additionally, your customInterface type should be exported (i.e. with capital "C") otherwise it won't be accessible from the bson runtime.
I suspect that using an interface implies that the NestedDocuments array may contain different types, all implementing customInterface.
If that's the case, I am afraid that you will have to do some changes:
Firstly, NestedDocument needs to be a struct holding your document plus some information to help the decoder understand what the underlying type is. Something like:
type Document struct {
Props here....
Nested []NestedDocument
}
type NestedDocument struct {
Kind string
Payload bson.Raw
}
// Document decodes the payload into the concrete type selected by Kind.
func (d NestedDocument) Document() (CustomInterface, error) {
switch d.Kind {
case "TypeA":
// Here I am safely assuming that TypeA implements CustomInterface
result := &TypeA{}
err := d.Payload.Unmarshal(result)
if err != nil {
return nil, err
}
return result, nil
// ... other cases and default
}
}
In this way, the bson runtime will decode the whole Document but leave the payload as a []byte.
Once you have decoded the main Document, you can use the NestedDocument.Document() function to get a concrete representation of your struct.
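The same kind/payload pattern can be tried out without a database. The sketch below swaps in encoding/json, with json.RawMessage standing in for bson.Raw, so it shows the pattern rather than the mgo API; the Describe method, the JSON field names and decodeAll are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CustomInterface stands in for the asker's interface; Describe is assumed.
type CustomInterface interface {
	Describe() string
}

type TypeA struct {
	Name string `json:"name"`
}

func (a *TypeA) Describe() string { return "TypeA:" + a.Name }

// NestedDocument keeps the payload undecoded until Kind has been inspected.
type NestedDocument struct {
	Kind    string          `json:"kind"`
	Payload json.RawMessage `json:"payload"`
}

// Document decodes the payload into the concrete type selected by Kind.
func (d NestedDocument) Document() (CustomInterface, error) {
	switch d.Kind {
	case "TypeA":
		result := &TypeA{}
		if err := json.Unmarshal(d.Payload, result); err != nil {
			return nil, err
		}
		return result, nil
	default:
		return nil, fmt.Errorf("unknown kind %q", d.Kind)
	}
}

type Document struct {
	Nested []NestedDocument `json:"nested"`
}

// decodeAll decodes the outer document, then resolves every nested entry.
func decodeAll(input string) ([]CustomInterface, error) {
	var doc Document
	if err := json.Unmarshal([]byte(input), &doc); err != nil {
		return nil, err
	}
	out := make([]CustomInterface, 0, len(doc.Nested))
	for _, n := range doc.Nested {
		c, err := n.Document()
		if err != nil {
			return nil, err
		}
		out = append(out, c)
	}
	return out, nil
}

func main() {
	items, err := decodeAll(`{"nested":[{"kind":"TypeA","payload":{"name":"x"}}]}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, it := range items {
		fmt.Println(it.Describe())
	}
}
```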
One last thing; when you persist your Document, make sure that Payload.Kind is set to 3, which represents an embedded document. See the BSON specifications for more details on this.
Hope it is all clear and good luck with your project.

Resources