goavro not able to validate json data with schema - go

I'm new to Go and Avro and am struggling with validating data. I have this Avro schema:
{
"namespace": "com.input",
"name": "parent",
"type": "record",
"fields": [
{
"name": "field1",
"type": [
"null",
{
"type": "record",
"name": "child",
"fields": [
{
"name": "child1",
"type": "string"
},
{
"name": "child2",
"type": "string"
}
]
}
]
}
]
}
Data to be validated:
{
"field1": {
"child1": "1",
"child2": "abc"
}
}
This is the code I'm using to validate the data with the goavro library:
func loadMockData() (stringFormatData string) {
mockData, err := ioutil.ReadFile("sample.json")
if err != nil {
log.Println(err)
}
return string(mockData)
}
func loadSchema() (stringFormatData string) {
mockSchema, err := ioutil.ReadFile("schema.avsc")
if err != nil {
log.Println(err)
}
return string(mockSchema)
}
avroSchema := loadSchema()
jsonString := loadMockData()
codec, err := goavro.NewCodec(avroSchema)
if err != nil {
log.Fatal(err)
}
decoded, _, err := codec.NativeFromTextual([]byte(jsonString))
This gives me following error:
NativeFromTextual error: cannot decode textual record "com.input.parent": cannot decode textual union: cannot decode textual map: cannot determine codec: "parentid"
I tried debugging the goavro library, and it seems that fieldcodec remains nil for parentid in map.go. With simple fields this setup works fine. The problem is that with nested data it is not able to get the codec for parentid.
Any help is much appreciated.
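One thing that may be worth checking: the Avro JSON encoding wraps every non-null union value in a single-key object named after the branch type, and goavro's textual decoding appears to expect that form. For the schema above, the data would presumably have to look like {"field1": {"com.input.child": {...}}}. The sketch below shows that rewrite using only encoding/json. wrapUnionField is a hypothetical helper, and the branch name com.input.child assumes the child record inherits the parent's namespace.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wrapUnionField is a hypothetical helper: it rewrites doc[field] from a bare
// value into {branch: value}, the form the Avro JSON encoding uses for a
// non-null union branch. A missing or null field is left untouched.
func wrapUnionField(doc []byte, field, branch string) ([]byte, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(doc, &obj); err != nil {
		return nil, err
	}
	val, ok := obj[field]
	if !ok || string(val) == "null" {
		return doc, nil
	}
	wrapped, err := json.Marshal(map[string]json.RawMessage{branch: val})
	if err != nil {
		return nil, err
	}
	obj[field] = wrapped
	return json.Marshal(obj)
}

func main() {
	in := []byte(`{"field1": {"child1": "1", "child2": "abc"}}`)
	// "com.input.child" is an assumption about the full name of the child record.
	out, err := wrapUnionField(in, "field1", "com.input.child")
	if err != nil {
		fmt.Println("wrap failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

If the wrapped document decodes where the bare one did not, the union encoding was the cause; writing sample.json in the wrapped form directly would then avoid the rewrite step.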

Related

How to handle JSON whose top level is an array with go-simplejson

The JSON structure looks like this:
[
{
"sha": "eb08dc1940e073a5c40d8b53a5fd58760fde8f27",
"node_id": "C_kwDOHb9FrtoAKGViMDhkYzE5NDBlMDczYTVjNDBkOGI1M2E1ZmQ1ODc2MGZkZThmMjc",
"commit": {
"author": {
"name": "xxxx"
},
"committer": {
"name": "xxxxx"
},
"message": "update DownLoad_Stitch_ACM.py",
"tree": {
"sha": "a30aab98319846f0e86da4a39ec05786e04c0a4f",
"url": "xxxxx"
},
"url": "xxxxx",
"comment_count": 0,
"verification": {
"verified": false,
"reason": "unsigned",
"signature": null,
"payload": null
}
},
"url": "xxxxx",
"html_url": "xxxxx",
"comments_url": "xxxxx",
"author": {
"login": "xxxxx",
"id": "xxxxx",
"node_id": "U_kgDOBkuicQ",
"avatar_url": "https://avatars.githubusercontent.com/u/105620081?v=4",
"gravatar_id": "",
"type": "User",
"site_admin": false
},
"committer": {
"login": "xxxxx",
"id": "xxxxx"
},
"parents": [
{
"sha": "cf867ec9dc4b904c466d9ad4b9338616d1213a06",
"url": "xxxxx",
"html_url": "xxxxx"
}
]
}
]
I don't know how to get the data at index 0.
content, _ := simplejson.NewJson(body)
arr, _ := content.Array() // This returns all the data as a []interface{}.
I cannot access the nested data with arr[0]["sha"]. How should I handle this?
It is not clear to the compiler that arr is an array of map[string]interface{} at compile time, as arr[0] is of type interface{}. This basically means that the compiler knows nothing about this type, which is why you can't do a map lookup operation here.
You can add a type assertion to make sure you can use it as a map like this:
asMap := arr[0].(map[string]interface{})
fmt.Println(asMap["sha"])
To get the SHA as a string, you can add another type assertion on the looked-up value:
asString := asMap["sha"].(string)
This is also shown in this working example. The downside of this is that your program will panic in case the given data is not of the specified type. You could instead use a type assertion with a check if it worked (asString, ok := ...), but it gets cumbersome with more complex data.
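The checked form of the assertion mentioned above could look roughly like this. It is a minimal sketch that uses encoding/json to produce the []interface{} (standing in for what simplejson's Array() returns), and firstSha is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// firstSha returns the "sha" of the first array element and reports an error
// instead of panicking when the data does not have the expected shape.
func firstSha(body []byte) (string, error) {
	var arr []interface{}
	if err := json.Unmarshal(body, &arr); err != nil {
		return "", err
	}
	if len(arr) == 0 {
		return "", errors.New("empty array")
	}
	// The comma-ok form yields ok == false instead of panicking.
	asMap, ok := arr[0].(map[string]interface{})
	if !ok {
		return "", errors.New("first element is not a JSON object")
	}
	sha, ok := asMap["sha"].(string)
	if !ok {
		return "", errors.New(`"sha" is missing or not a string`)
	}
	return sha, nil
}

func main() {
	sha, err := firstSha([]byte(`[{"sha": "eb08dc19"}]`))
	fmt.Println(sha, err)
}
```

Every additional level of nesting needs another checked assertion, which is why this gets cumbersome for deeper structures.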
This does work, but isn't really nice. I would recommend using a tool like this to generate Go structs and then use them in a type-safe way. First define a struct with all the info you need:
type ArrayElement struct {
Sha string `json:"sha"`
// Add more fields if you need them
}
Then you can just use the standard-library json package to unmarshal your data:
// This should have the same structure as the data you want to parse
var result []ArrayElement
err := json.Unmarshal([]byte(str), &result)
if err != nil {
panic(err)
}
fmt.Println(result[0].Sha)
Here is an example for that -- this is a more Go-like approach for converting JSON data.
Your JSON data is malformed. First of all, remove the trailing comma after the "id": "xxxxx" line:
...
"id": "xxxxx"
...
You should check the error returned by NewJson to find out if there is a problem:
content, err := simplejson.NewJson(body)
if err != nil {
// log err
}
To get the sha from the first element, you can simply use simplejson's built-in methods (String returns the value together with an error):
shaVal, err := content.GetIndex(0).Get("sha").String()
Here is how you can get the desired value.
This worked for me.
package main
import (
"encoding/json"
"fmt"
)
type MyData []struct {
Sha string `json:"sha"`
NodeID string `json:"node_id"`
Commit struct {
Author struct {
Name string `json:"name"`
} `json:"author"`
Committer struct {
Name string `json:"name"`
} `json:"committer"`
Message string `json:"message"`
Tree struct {
Sha string `json:"sha"`
URL string `json:"url"`
} `json:"tree"`
URL string `json:"url"`
CommentCount int `json:"comment_count"`
Verification struct {
Verified bool `json:"verified"`
Reason string `json:"reason"`
Signature interface{} `json:"signature"`
Payload interface{} `json:"payload"`
} `json:"verification"`
} `json:"commit"`
URL string `json:"url"`
HTMLURL string `json:"html_url"`
CommentsURL string `json:"comments_url"`
Author struct {
Login string `json:"login"`
ID string `json:"id"`
NodeID string `json:"node_id"`
AvatarURL string `json:"avatar_url"`
GravatarID string `json:"gravatar_id"`
Type string `json:"type"`
SiteAdmin bool `json:"site_admin"`
} `json:"author"`
Committer struct {
Login string `json:"login"`
ID string `json:"id"`
} `json:"committer"`
Parents []struct {
Sha string `json:"sha"`
URL string `json:"url"`
HTMLURL string `json:"html_url"`
} `json:"parents"`
}
func main() {
my_json_data := `[
{
"sha": "eb08dc1940e073a5c40d8b53a5fd58760fde8f27",
"node_id": "C_kwDOHb9FrtoAKGViMDhkYzE5NDBlMDczYTVjNDBkOGI1M2E1ZmQ1ODc2MGZkZThmMjc",
"commit": {
"author": {
"name": "xxxx"
},
"committer": {
"name": "xxxxx"
},
"message": "update DownLoad_Stitch_ACM.py",
"tree": {
"sha": "a30aab98319846f0e86da4a39ec05786e04c0a4f",
"url": "xxxxx"
},
"url": "xxxxx",
"comment_count": 0,
"verification": {
"verified": false,
"reason": "unsigned",
"signature": null,
"payload": null
}
},
"url": "xxxxx",
"html_url": "xxxxx",
"comments_url": "xxxxx",
"author": {
"login": "xxxxx",
"id": "xxxxx",
"node_id": "U_kgDOBkuicQ",
"avatar_url": "https://avatars.githubusercontent.com/u/105620081?v=4",
"gravatar_id": "",
"type": "User",
"site_admin": false
},
"committer": {
"login": "xxxxx",
"id": "xxxxx"
},
"parents": [
{
"sha": "cf867ec9dc4b904c466d9ad4b9338616d1213a06",
"url": "xxxxx",
"html_url": "xxxxx"
}
]
}]`
var data MyData
err := json.Unmarshal([]byte(my_json_data), &data)
if err != nil {
panic(err)
}
fmt.Println("data --> sha: ", data[0].Sha)
}

goavro and original Go data structure

How can I recreate original data structure in Golang serialized with avro using goavro?
With this library https://github.com/hamba/avro it's quite easy.
out := SimpleRecord{}
err = avro.Unmarshal(schema, data, &out)
The type of the variable out is SimpleRecord.
Let's say I have this struct and avro schema:
type SimpleRecord struct {
F1 int `avro:"f1"`
F2 string `avro:"f2"`
F3 string `avro:"f3"`
Dependencies []string `avro:"dependencies"`
}
func main() {
avro_schema_txt := `{
"type": "record",
"name": "AvroData",
"namespace": "data.avro",
"doc": "docstring",
"fields": [
{
"name": "f1",
"type": "int"
},
{
"name": "f2",
"type": "string"
},
{
"name": "f3",
"type": "string"
},
{
"name": "dependencies",
"type": {
"type": "array",
"items": "string"
}
}
]
}`
}
and then
codec, err := goavro.NewCodec(avro_schema_txt)
if err != nil {
log.Fatal(err.Error())
}
out, _, err := codec.NativeFromBinary(data)
if err != nil {
log.Fatal(err.Error())
}
fmt.Println(out)
where data is marshaled with avro, out is of type interface{}, so how can I "make" it SimpleRecord?
There are two ways you can do it: either by type-asserting out and doing a 'manual mapping', or by using codec.TextualFromNative. Both approaches are shown below for completeness.
Approach 1
Type-assert out from interface{} to map[string]interface{} and retrieve the values. Each value is itself an interface{}, so it needs its own type assertion (goavro decodes an Avro int as int32):
...
simpleMap := out.(map[string]interface{})
f1 := int(simpleMap["f1"].(int32))
f2 := simpleMap["f2"].(string)
...
record := SimpleRecord{
F1: f1,
F2: f2,
...
}
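A fuller version of the manual mapping might look like the sketch below. recordFromNative is a hypothetical helper, and the hand-built map in main only stands in for the value returned by codec.NativeFromBinary. The sketch assumes that an Avro int arrives as int32 and an Avro array as []interface{}.

```go
package main

import (
	"fmt"
)

type SimpleRecord struct {
	F1           int
	F2           string
	F3           string
	Dependencies []string
}

// recordFromNative is a hypothetical helper that copies a decoded native map
// into a SimpleRecord, returning an error when a field has an unexpected type.
func recordFromNative(native interface{}) (SimpleRecord, error) {
	var rec SimpleRecord
	m, ok := native.(map[string]interface{})
	if !ok {
		return rec, fmt.Errorf("expected map[string]interface{}, got %T", native)
	}
	// Assumption: Avro "int" is decoded as int32.
	f1, ok := m["f1"].(int32)
	if !ok {
		return rec, fmt.Errorf("f1: expected int32, got %T", m["f1"])
	}
	rec.F1 = int(f1)
	if rec.F2, ok = m["f2"].(string); !ok {
		return rec, fmt.Errorf("f2: expected string, got %T", m["f2"])
	}
	if rec.F3, ok = m["f3"].(string); !ok {
		return rec, fmt.Errorf("f3: expected string, got %T", m["f3"])
	}
	// Assumption: an Avro array is decoded as []interface{}.
	items, ok := m["dependencies"].([]interface{})
	if !ok {
		return rec, fmt.Errorf("dependencies: expected []interface{}, got %T", m["dependencies"])
	}
	for _, it := range items {
		s, ok := it.(string)
		if !ok {
			return rec, fmt.Errorf("dependencies: expected string item, got %T", it)
		}
		rec.Dependencies = append(rec.Dependencies, s)
	}
	return rec, nil
}

func main() {
	// Stand-in for the value returned by codec.NativeFromBinary.
	native := map[string]interface{}{
		"f1":           int32(1),
		"f2":           "tester2",
		"f3":           "tester3",
		"dependencies": []interface{}{"tester4", "tester5"},
	}
	rec, err := recordFromNative(native)
	fmt.Println(rec, err)
}
```

Compared with Approach 2 this avoids a JSON round trip, at the cost of one checked assertion per field.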
Approach 2
Use TextualFromNative. The code below shows both the encode and the decode process:
var avro_schema_txt = `{
"type": "record",
"name": "AvroData",
"namespace": "data.avro",
"doc": "docstring",
"fields": [
{
"name": "f1",
"type": "int"
},
{
"name": "f2",
"type": "string"
},
{
"name": "f3",
"type": "string"
},
{
"name": "dependencies",
"type": {
"type": "array",
"items": "string"
}
}
]
}`
// added json to match field names in avro
type SimpleRecord struct {
F1 int `avro:"f1" json:"f1"`
F2 string `avro:"f2" json:"f2"`
F3 string `avro:"f3" json:"f3"`
Dependencies []string `avro:"dependencies" json:"dependencies"`
}
func encodeDecode() {
data := SimpleRecord{
F1: 1,
F2: "tester2",
F3: "tester3",
Dependencies: []string { "tester4", "tester5" },
}
codec, err := goavro.NewCodec(avro_schema_txt)
if err != nil {
log.Fatal(err.Error())
}
// encode (json2 is presumably an import alias for encoding/json)
textualIn, err := json2.Marshal(data)
if err != nil {
log.Fatal(err.Error())
}
nativeIn, _, err := codec.NativeFromTextual(textualIn)
if err != nil {
log.Fatal(err.Error())
}
binaryIn, err := codec.BinaryFromNative(nil, nativeIn)
if err != nil {
log.Fatal(err.Error())
}
// decode
nativeOut, _, err := codec.NativeFromBinary(binaryIn)
if err != nil {
log.Fatal(err.Error())
}
textualOut, err := codec.TextualFromNative(nil, nativeOut)
if err != nil {
log.Fatal(err.Error())
}
var out = SimpleRecord{}
err = json2.Unmarshal(textualOut, &out)
if err != nil {
log.Fatal(err.Error())
}
if !reflect.DeepEqual(data, out) {
log.Fatal("should be equal")
}
}

Unmarshalling nested json object from http request returns nil

I've been going through other similar questions here but I don't know what I'm doing wrong.
I am calling this API:
https://coronavirus-tracker-api.herokuapp.com/v2/locations
Which returns a JSON object like this one:
{
"latest": {
"confirmed": 272166,
"deaths": 11299,
"recovered": 87256
},
"locations": [
{
"id": 0,
"country": "Thailand",
"country_code": "TH",
"province": "",
"last_updated": "2020-03-21T06:59:11.315422Z",
"coordinates": {
"latitude": "15",
"longitude": "101"
},
"latest": {
"confirmed": 177,
"deaths": 1,
"recovered": 41
}
},
{
"id": 39,
"country": "Norway",
"country_code": "NO",
"province": "",
"last_updated": "2020-03-21T06:59:11.315422Z",
"coordinates": {
"latitude": "60.472",
"longitude": "8.4689"
},
"latest": {
"confirmed": 1463,
"deaths": 3,
"recovered": 1
}
}
]
}
So I have written a small program to parse it but I can only parse the outer object ("latest") while the inner array ("locations") always returns nil.
Code is here (even if TCP calls don't work on the playground):
https://play.golang.org/p/ma225d07iRA
and here:
package main
import (
"encoding/json"
"fmt"
"net/http"
"time"
)
type AutoGenerated struct {
Latest Latest `json:"latest"`
Locations []Locations `json:"locations"`
}
type Latest struct {
Confirmed int `json:"confirmed"`
Deaths int `json:"deaths"`
Recovered int `json:"recovered"`
}
type Coordinates struct {
Latitude string `json:"latitude"`
Longitude string `json:"longitude"`
}
type Locations struct {
ID int `json:"id"`
Country string `json:"country"`
CountryCode string `json:"country_code"`
Province string `json:"province"`
LastUpdated time.Time `json:"last_updated"`
Coordinates Coordinates `json:"coordinates"`
Latest Latest `json:"latest"`
}
var latestUrl = "https://coronavirus-tracker-api.herokuapp.com/v2/latest"
func getJson(url string, target interface{}) {
req, err := http.NewRequest("GET", url, nil)
if err != nil {
fmt.Println(err)
return
}
req.Header.Add("content-type", "application/json")
res, err := http.DefaultClient.Do(req)
if err != nil {
fmt.Println(err)
return
}
// Close the body once the function returns
defer res.Body.Close()
decoder := json.NewDecoder(res.Body)
var data AutoGenerated
err = decoder.Decode(&data)
if err != nil {
fmt.Println(err)
return
}
for i, loc := range data.Locations {
fmt.Printf("%d: %s\n", i, loc.Country)
}
}
func main() {
var a AutoGenerated
getJson(latestUrl, &a)
}
The problem is that the endpoint https://coronavirus-tracker-api.herokuapp.com/v2/latest does not return locations. This is the response I get by calling it:
{
"latest": {
"confirmed": 304524,
"deaths": 12973,
"recovered": 91499
}
}
If you call the https://coronavirus-tracker-api.herokuapp.com/v2/locations endpoint instead, the response includes the locations array, so the same code should populate it.
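The behaviour described here can be reproduced without any network call. The sketch below decodes the two response shapes quoted in this thread into trimmed-down versions of the question's structs (Response and Location are hypothetical names). A missing "locations" key simply leaves the slice nil, and no error is reported.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Latest struct {
	Confirmed int `json:"confirmed"`
	Deaths    int `json:"deaths"`
	Recovered int `json:"recovered"`
}

type Location struct {
	Country string `json:"country"`
}

type Response struct {
	Latest    Latest     `json:"latest"`
	Locations []Location `json:"locations"`
}

// decode parses a response body into a Response.
func decode(body string) (Response, error) {
	var r Response
	err := json.Unmarshal([]byte(body), &r)
	return r, err
}

func main() {
	// Shape returned by /v2/latest: there is no "locations" key at all.
	latestOnly, err := decode(`{"latest": {"confirmed": 304524, "deaths": 12973, "recovered": 91499}}`)
	fmt.Println(latestOnly.Locations == nil, err)

	// Shape returned by /v2/locations (shortened).
	full, err := decode(`{"latest": {"confirmed": 272166}, "locations": [{"country": "Thailand"}]}`)
	fmt.Println(len(full.Locations), err)
}
```

Because encoding/json silently skips keys that are absent, printing the raw body or checking the URL is usually the quickest way to spot this kind of mismatch.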

Custom built JSON schema not validating properly

I have a custom-built JSON schema file that nests the actual schemas under a few extra top-level keys. The problem is that validation is incomplete: for example, it only detects 2 out of 4 fields, and neither required nor additionalProperties has any effect. I'm using the gojsonschema library for my JSON schema.
{
"users": {
"PUT": {
"definitions": {},
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://example.com/root.json",
"type": "object",
"title": "The Root Schema",
"required": [
"DisplayName",
"Username",
"Email",
"Password"
],
"properties": {
"DisplayName": {
"$id": "#/properties/DisplayName",
"type": "string",
"title": "The Displayname Schema",
"default": "",
"examples": [
""
],
"minLength": 3,
"maxLength": 24,
"pattern": "^(.*)$"
},
"Username": {
"$id": "#/properties/Username",
"type": "string",
"title": "The Username Schema",
"default": "",
"examples": [
""
],
"minLength": 3,
"maxLength": 15,
"pattern": "^(.*)$"
},
"Email": {
"$id": "#/properties/Email",
"type": "string",
"title": "The Email Schema",
"default": "",
"examples": [
""
],
"minLength": 7,
"pattern": "^(.*)$",
"format": "email"
},
"Password": {
"$id": "#/properties/Password",
"type": "string",
"title": "The Password Schema",
"default": "",
"examples": [
""
],
"pattern": "^(.*)$"
}
},
"additionalProperties": false
}
}
}
I'm parsing everything like this:
func Validate(data interface{}, r *http.Request) (interface{}, error) {
// Convert the data struct to a readable JSON bytes
JSONparams, err := json.Marshal(data)
if err != nil {
return nil, err
}
// Split URL segments so we know what part of the API they are accessing
modules := strings.Split(r.URL.String(), "/")
modules = modules[(len(modules) - 1):]
// Read the schema file
fileSchema, err := ioutil.ReadFile("config/schema/schema.json")
if err != nil {
return nil, err
}
var object interface{}
// Unmarshal it so we can choose what schema we specifically want
err = json.Unmarshal(fileSchema, &object)
if err != nil {
return nil, err
}
// Choose the preferred schema
encodedJSON, err := json.Marshal(object.(map[string]interface{})[strings.Join(modules, "") + "s"].(map[string]interface{})[r.Method])
if err != nil {
return nil, err
}
// Load the JSON schema
schema := gojsonschema.NewStringLoader(string(encodedJSON))
// Load the JSON params
document := gojsonschema.NewStringLoader(string(JSONparams))
// Validate the document
result, err := gojsonschema.Validate(schema, document)
if err != nil {
return nil, err
}
if !result.Valid() {
// Map the errors into a new array
var errors = make(map[string]string)
for _, err := range result.Errors() {
errors[err.Field()] = err.Description()
}
// Convert the array to an interface that we can convert to JSON
resultMap := map[string]interface{}{
"success": false,
"result": map[string]interface{}{},
"errors": errors,
}
// Convert the interface to a JSON object
errorObject, err := json.Marshal(resultMap)
if err != nil {
return nil, err
}
return errorObject, nil
}
return nil, nil
}
type CreateParams struct {
DisplayName string
Username string
Email string
Password string
}
var (
response interface{}
status int = 0
)
func Create(w http.ResponseWriter, r *http.Request) {
status = 0
// Parse the request so we can access the query parameters
r.ParseForm()
// Assign them to the interface variables
data := &CreateParams{
DisplayName: r.Form.Get("DisplayName"),
Username: r.Form.Get("Username"),
Email: r.Form.Get("Email"),
Password: r.Form.Get("Password"),
}
// Validate the JSON data
errors, err := schema.Validate(data, r)
if err != nil {
responseJSON := map[string]interface{}{
"success": false,
"result": map[string]interface{}{},
}
// log.Fatal would exit the process here, so the error response below would never be sent
log.Println(err)
response, err = json.Marshal(responseJSON)
status = http.StatusInternalServerError
}
// Catch any errors generated by the validator and assign them to the response interface
if errors != nil {
response = errors
status = http.StatusBadRequest
}
// Status has not been set yet, so it's safe to assume that everything went fine
if status == 0 {
responseJSON := map[string]interface{}{
"success": true,
"result": map[string]interface{} {
"DisplayName": data.DisplayName,
"Username": data.Username,
"Email": data.Email,
"Password": nil,
},
}
response, err = json.Marshal(responseJSON)
status = http.StatusOK
}
// We are going to respond with JSON, so set the appropriate header
w.Header().Set("Content-Type", "application/json")
// Write the header and the response
w.WriteHeader(status)
w.Write(response.([]byte))
}
The reason I'm doing it like this is that I'm building a REST API, and if api/auth/user gets a PUT request, I want to be able to specify the data requirements specifically for the "users" part with the PUT method.
Any idea how this can be achieved?
EDIT:
My json data:
{
"DisplayName": "1234",
"Username": "1234",
"Email": "test#gmail.com",
"Password": "123456"
}
EDIT 2:
This data should fail with the schema.
{
"DisplayName": "1", // min length is 3
"Username": "", // this field is required but is empty here
"Email": "testgmail.com", // not following the email format
"Password": "123456111111111111111111111111111111111111111111111" // too long
}
If I manually load the schema and data using gojsonschema it works as expected. I suspect that since you're loading the schema in a somewhat complicated fashion the schema you put in ends up being something different than what you'd expect, but since your code samples are all HTTP based I can't really test it out myself.
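One way to narrow this down might be to make the schema selection simpler and inspectable. The sketch below is only an illustration, not the asker's code: selectSchema is a hypothetical helper that pulls file[resource][method] out of the combined schema file with json.RawMessage, so the chosen schema is never converted to interface{} and re-marshalled, and it can be printed before it is passed to the validator.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// selectSchema is a hypothetical helper: it picks the schema stored under
// file[resource][method] and returns its raw JSON without re-encoding it.
func selectSchema(file []byte, resource, method string) (json.RawMessage, error) {
	var byResource map[string]map[string]json.RawMessage
	if err := json.Unmarshal(file, &byResource); err != nil {
		return nil, err
	}
	byMethod, ok := byResource[resource]
	if !ok {
		return nil, fmt.Errorf("no schemas for resource %q", resource)
	}
	schema, ok := byMethod[method]
	if !ok {
		return nil, fmt.Errorf("no schema for %s %q", method, resource)
	}
	return schema, nil
}

func main() {
	// Shortened stand-in for config/schema/schema.json.
	file := []byte(`{"users":{"PUT":{"type":"object","required":["Username"],"additionalProperties":false}}}`)
	schema, err := selectSchema(file, "users", "PUT")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Printing the selected schema shows exactly what the validator will receive.
	fmt.Println(string(schema))
}
```

The returned bytes can then be handed to gojsonschema.NewStringLoader(string(schema)), as in the question. A missing resource or method becomes an ordinary error rather than a panic from a failed type assertion.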

Protocol buffer serialization Golang

I am using DialogFlow V2 official GoLang SDK. In my webhook, I am returning a payload, which I'm obtaining using the function GetWebhookPayload().
This returns *google_protobuf4.Struct. I would like to turn this struct into a map[string]interface{}. How is this possible?
This is what the struct looks like when serialized:
"payload": {
"fields": {
"messages": {
"Kind": {
"ListValue": {
"values": [
{
"Kind": {
"StructValue": {
"fields": {
"title": {
"Kind": {
"StringValue": "Hi! How can I help?"
}
},
"type": {
"Kind": {
"StringValue": "message"
}
}
}
}
}
}
]
}
}
}
}
What I essentially need is for it to be serialized as such:
"payload": {
"messages": [
{
"title": "Hi! How can I help?",
"type": "message"
}
]
}
This can be solved using jsonpb.
package main
import (
"bytes"
"encoding/json"
"github.com/golang/protobuf/jsonpb"
)
func main() {
...
payload := qr.GetWebhookPayload()
b, marshaler := bytes.Buffer{}, jsonpb.Marshaler{}
if err := marshaler.Marshal(&b, payload.GetFields()["messages"]); err != nil {
// handle err
}
msgs := []interface{}{}
if err := json.Unmarshal(b.Bytes(), &msgs); err != nil {
// handle err
}
// msgs now populated
}

Resources