Google cloud datastore + golang + embedded entities - go

I've been working on a package to facilitate API development. It validates input data against a schema structure that maps each field name to a value validator/formatter, etc. Unfortunately, datastore does not accept my payload of type map[string]interface{}.
I have then been experimenting with the PropertyLoadSaver interface, constructing a slice of properties from a struct's values. All values are pointers (which datastore does not accept, except for pointers to structs). I use pointers so that a value that was not provided is skipped when the pointer is nil, rather than stored as the zero value.
It works pretty well. The problem comes when I want to use embedded structs with pointer fields. I thought I would just add a property with a name and a value of type "entity". This entity has a nil key and properties (the inner fields).
I thought this would let me handle a JSON POST bound to a struct like:
type Outer struct {
    A *string
    B *int
    I *Inner
}
And
type Inner struct {
    C *bool
    D *float64
}
I could then fully use the power of NoSQL being schemaless and flexible, with entities that may or may not have the optional property I, partially or totally filled (C and D could then also be optional).
This would be lighter, storing only the provided data and ignoring other properties (in the datastore GUI you can manually create entities of various shapes). When loading, any property that is not set on the retrieved entity would leave its pointer nil, so the user is shown nothing at all for values that were never provided and stored, rather than misleading zero values.
In the datastore GUI, when manually creating an entity, you can set a property of type "embedded entity". This is exactly what I am trying to do. But when, in the Save method of my PropertyLoadSaver-compatible struct, I add to the property slice a property of type "datastore.Entity" with a nil key and a property list matching the properties of the inner struct, I receive this error: "invalid value for a property with name cinfo".
Any ideas?
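The approach described above can be sketched without the datastore client at all. The Property and Entity types below are local stand-ins that only mirror the shape of the client's types; they are assumptions for illustration, not the real API. The one detail worth checking against the client documentation is that the Go datastore packages list *Entity (a pointer) among the valid property value types, so passing an Entity by value may be what triggers the "invalid value" error. That is a guess, not something the question confirms.

```go
package main

import "fmt"

// Property and Entity are local stand-ins that mirror the shape of the
// datastore client's types. They are NOT the real API.
type Property struct {
	Name  string
	Value interface{}
}

type Entity struct {
	Properties []Property // no key: an embedded entity does not need one
}

type Inner struct {
	C *bool
	D *float64
}

type Outer struct {
	A *string
	B *int
	I *Inner
}

// props returns one property per non-nil field of Inner.
func (in *Inner) props() []Property {
	var ps []Property
	if in.C != nil {
		ps = append(ps, Property{Name: "C", Value: *in.C})
	}
	if in.D != nil {
		ps = append(ps, Property{Name: "D", Value: *in.D})
	}
	return ps
}

// Save mimics PropertyLoadSaver.Save: nil pointers are skipped entirely,
// and the nested struct is stored as a POINTER to an Entity.
func (o *Outer) Save() ([]Property, error) {
	var ps []Property
	if o.A != nil {
		ps = append(ps, Property{Name: "A", Value: *o.A})
	}
	if o.B != nil {
		// Widened to int64 on the assumption that the store only takes int64.
		ps = append(ps, Property{Name: "B", Value: int64(*o.B)})
	}
	if o.I != nil {
		ps = append(ps, Property{Name: "I", Value: &Entity{Properties: o.I.props()}})
	}
	return ps, nil
}

func main() {
	a, c := "hello", true
	o := &Outer{A: &a, I: &Inner{C: &c}}
	ps, _ := o.Save()
	for _, p := range ps {
		fmt.Printf("%s: %T\n", p.Name, p.Value)
	}
}
```

With the real client, the same Save body would return []datastore.Property, and a matching Load would walk the property slice and leave any field whose property is absent as nil.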

Related

Deriving a HashMap/BTree from a struct of values in Rust

I am trying to do the following:
Deserialise a struct of fields into another struct of a different format.
I am trying to do the transformation via a HashMap, as this seems to be the best-suited way.
I am required to do this as part of the transformation of a legacy system, this being one of the intermediary phases.
I have raised another question which covers a subset of the same use case, but I do not think it gives enough of an overview, hence this question with more details and code.
(Rust converting a struct to key value pair where the field is not none)
I will merge both questions once I figure out how.
Scenario:
I am getting input via IPC through a huge proto.
I am using prost to deserialise this data into a struct.
My deserialised struct has n fields, and that number can grow.
I need to transform this deserialised struct into a key/value struct (shown below).
In the incoming data most keys will be null, i.e. out of n fields, most likely only one or two have values at a given time.
Input struct after prost deserialisation:
I am using proto3, so I am unable to define optional fields.
Ideally I would prefer a prost struct with an Option on every field, i.e. Option<String> instead of String.
struct Data{
int32 field1,
int64 field2,
string field3,
...
}
This needs to be transformed to a generic struct as below for further use:
struct TransformedData
{
string Key
string Value
}
So,
Data{
field1: 20
field2: null
field3: null
field4: null
....
}
becomes
TransformedData{
key:"field1"
Value: "20"
}
Methods I have tried:
Method 1:
Add serde to the prost struct definition and deserialise it into a map.
Loop over each item in the map to get the values which are non-null.
Use these to create a struct.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=9645b1158de31fd54976926c9665d6b4
This has its challenges:
1. Each primitive data type has its own default value, which needs to be checked for.
2. Nested structs will result in an object data type, which needs to be handled.
3. Every field of the struct needs to be iterated over to determine the non-null values.
Challenges 1 and 2 can be mitigated by making the fields optional (I have yet to figure out how in proto3 and prost).
But I am not sure how to get over 3.
Iterating over a large struct to find non-null fields is not scalable and will be expensive.
Method 2:
I am using prost-reflect's dynamic reflection to deserialise and get only the specified value.
I do this by ensuring each proto message has two extra fields:
proto -> signifying the proto struct being used when serialising.
key -> signifying the field which has a value when serialising.
let fd = package::FILE_DESCRIPTOR_SET;
let pool = DescriptorPool::decode(fd).unwrap();
let proto_message: package::GenericProto = prost::Message::decode(message.as_ref()).expect("error when de-serialising protobuf message to struct");
let proto = proto_message.proto.as_str();
let key = proto_message.key.as_str();
Then, using key, I fetch the field from what looks to be a map implemented by prost:
let message_descriptor = pool.get_message_by_name(proto).unwrap();
let dynamic_message = DynamicMessage::decode(message_descriptor, message.as_ref()).unwrap();
let data = dynamic_message.get_field_by_name(key).unwrap().as_ref().clone();
Here:
1. This fails when someone sends a struct with multiple fields filled.
2. This does not work for nested objects, as nested objects in turn need to be converted to a map.
I am looking for the least expensive way to do the above.
I understand the solution I am looking for is pretty out there and that I am taking a one-size-fits-all approach.
Any pointers appreciated.

Go Struct data identifier/authenticity

How can I make a struct uniquely identifiable by the data it holds? The order of the fields doesn't matter; what matters is that the values of the struct's fields must be exactly the same in order to be authenticated/identified.
Currently I use SHA256 hashing: I hash the struct with the said data. For the next incoming data with the same struct, I hash again and compare against the previously hashed data to check whether it already existed.
So, let's say:
type F struct {
A string
B string
C interface{}
}
The value of C can be arbitrary: simple types (string, int, etc.), a map, or a struct (e.g. nested JSON). How can I make every piece of data passed to the F struct unique? Am I already doing it right by using SHA256 on the struct? I'm worried that the value of C might affect the hash.
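One way to get a stable hash for an arbitrary C is to hash a canonical byte encoding of the struct rather than the struct itself. The sketch below assumes encoding/json is an acceptable canonical form; Fingerprint is a hypothetical helper name. It relies on two documented properties of json.Marshal: struct fields are written in declaration order and map keys are sorted, so two maps with the same entries produce the same bytes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

type F struct {
	A string
	B string
	C interface{}
}

// Fingerprint (hypothetical helper) returns the hex SHA-256 of the JSON
// encoding of f. json.Marshal sorts map keys, so maps built in a different
// order still hash to the same value.
func Fingerprint(f F) (string, error) {
	b, err := json.Marshal(f)
	if err != nil {
		return "", err // e.g. C holds a channel or a func
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	x := F{A: "a", B: "b", C: map[string]interface{}{"k1": 1, "k2": "v"}}
	y := F{A: "a", B: "b", C: map[string]interface{}{"k2": "v", "k1": 1}}
	hx, _ := Fingerprint(x)
	hy, _ := Fingerprint(y)
	fmt.Println("same content, same hash:", hx == hy)
}
```

A caveat with this choice: values that encode to the same JSON hash identically (an int 1 and a float64 1 both encode as 1), and two different struct types holding the same fields in a different declaration order hash differently. Whether that matters depends on what counts as "the same data" for the application.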

Any down-side always using pointers for struct field types?

Originally I figured I'd only use pointers for optional struct fields which could potentially be nil in the cases the struct was initially built for.
As my code evolved I was writing different layers on top of my models, for XML and JSON (un)marshalling. In these cases even the fields I thought would always be required (Id, Name, etc.) actually turned out to be optional for some layers.
In the end I had put a * in front of all the fields, so int became *int, string became *string, etc.
Now I'm wondering if I would have been better off not generalising my code so much. I could have duplicated the code instead, which I find rather ugly, but perhaps it is more efficient than using pointers for all struct fields?
So my question is whether this is turning into an anti-pattern and just a bad habit, or whether this added flexibility comes without a cost from a performance point of view.
E.g. can you come up with good arguments for sticking with option A:
type MyStruct struct {
Id int
Name string
ParentId *int
// etc.. only pointers where NULL columns in db might occur
}
over this option B:
type MyStruct struct {
Id *int
Name *string
ParentId *int
// etc... using *pointers for all fields
}
Would the best practice way of modelling your structs be from a purely database/column perspective, or eg if you had:
func (m *MyStruct) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
var v struct {
XMLName xml.Name `xml:"myStruct"`
Name string `xml:"name"`
Parent string `xml:"parent"`
Children []*MyStruct `xml:"children,omitempty"`
}
err := d.DecodeElement(&v, &start)
if err != nil {
return err
}
m.Id = nil // adding to db from xml, there's initially no Id until after the insert
m.Name = &v.Name // Name is *string in option B; a parent might be referenced by name or alias
m.ParentId = nil // not by ParentId, since it's not created yet, but maybe by nesting elements like Children []*MyStruct in the v struct above
// etc..
return nil
}
This example could be part of the scenario where you want to add elements from XML to the database. Here ids would generally not make sense, so instead we use nesting and references by name or other aliases. An Id for the structs would not be set until we got the id after the INSERT query. Then, using that Id, we could traverse down the hierarchy to the child elements, etc.
This would allow us to have just one MyStruct and use, e.g., different POST HTTP request handler functions depending on whether the call came from form input or from XML importing, where a nested hierarchy and different relations might need some different handling.
In the end I guess what I'm asking is:
Would you be better off separating struct models for db, xml- and json operations (or whatever scenario that you can think of), than using struct field pointers all the way, so we can reuse the model for different, yet related stuff?
Apart from possible performance (more pointers = more things for the GC to scan), safety (nil pointer dereference), convenience (s.a = 42 vs s.a = new(int); *s.a = 42), and memory penalties (a bool is one byte, a *bool is four to eight bytes), there is one thing that really bothers me in the all-pointer approach. It violates the single responsibility principle.
Is the MyStruct you get from XML or DB same as MyStruct? What if the DB schema will change? What if the XML changes format? What if you'll also need to unmarshal it into JSON, but in a slightly different manner? And what if you need to support all that (and in multiple versions!) at the same time?
A lot of pain comes to you when you try to make one thing do many things. Is having one do-it-all type instead of N specialised types really worth it?
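As a rough illustration of the "N specialised types" alternative, the sketch below keeps one type per layer and converts between them explicitly. The type names xmlItem and dbItem, the XML layout, and the counter standing in for ids returned by INSERT are all assumptions made for the example; none of them come from the question.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// xmlItem (hypothetical) only knows about the XML layout: no ids, just nesting.
type xmlItem struct {
	XMLName  xml.Name  `xml:"myStruct"`
	Name     string    `xml:"name"`
	Children []xmlItem `xml:"children>myStruct"`
}

// dbItem (hypothetical) mirrors the table columns; only the nullable
// column is a pointer.
type dbItem struct {
	ID       int
	Name     string
	ParentID *int
}

// parse decodes one XML document into the XML-layer type.
func parse(doc string) (xmlItem, error) {
	var x xmlItem
	err := xml.Unmarshal([]byte(doc), &x)
	return x, err
}

// toDB flattens the XML tree into rows. nextID is a counter standing in
// for the id that a real INSERT would return.
func toDB(x xmlItem, parent *int, nextID *int) []dbItem {
	*nextID++
	id := *nextID
	rows := []dbItem{{ID: id, Name: x.Name, ParentID: parent}}
	for _, child := range x.Children {
		rows = append(rows, toDB(child, &id, nextID)...)
	}
	return rows
}

func main() {
	doc := `<myStruct><name>root</name><children>` +
		`<myStruct><name>kid</name></myStruct></children></myStruct>`
	root, err := parse(doc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	n := 0
	for _, r := range toDB(root, nil, &n) {
		fmt.Println(r.ID, r.Name, r.ParentID != nil)
	}
}
```

The conversion function is the only place that knows about both shapes, so a change to the XML format or to the table touches one type and one function rather than every user of a shared all-pointer struct.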

Empty or not required struct fields

I have two structs that represent models that will be inserted into a mongodb database. One struct (Investment) has the other struct (Group) as one of its fields.
type Group struct {
Base
Name string `json:"name" bson:"name"`
}
type Investment struct {
Base
Symbol string `json:"symbol" bson:"symbol" binding:"required"`
Group Group `json:"group" bson:"group"`
Fields bson.M `json:"fields" bson:"fields"`
}
The problem I'm having is that in the Investment model, Group is not required. If there is no group, I think it's better for it not to be inserted in the db. What's the best way to handle a db model such as this in Go?
tl;dr: Use ,omitempty, and if you need to worry about the difference between a zero value and null/not specified, do what the GitHub API does and use a pointer.
Both json and bson support the ,omitempty tag. For json, "empty values are false, 0, any nil pointer or interface value, and any array, slice, map, or string of length zero" (json docs). For bson, ,omitempty means "Only include the field if it's not set to the zero value for the type or to empty slices or maps", and zero values include empty strings and nil pointers (bson docs).
So if you really need a Group struct, you can put a *Group in instead, and it won't be stored when the pointer is nil. If Investment only needs to hold the group's name, it's even simpler: "" as group name keeps a group key from being stored.
bson defaults to using the lowercased field name already so you can omit that from the struct tag when they match. json will default to the Capitalized name, so specify the lowercase name in a tag if you need lowercase.
So, best case, maybe you can just use:
type Investment struct {
Base
Symbol string `json:"symbol" binding:"required"`
Group string `json:"group,omitempty" bson:",omitempty"`
Fields bson.M `json:"fields"`
}
If you ever run into fields where the zero value for the type ("", 0, false, etc.) is distinct from "not specified", you can do what the GitHub API does and put pointers in your structures--essentially an extension of the *Group trick.
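The effect of ,omitempty on pointer fields can be checked with the standard encoding/json package alone; the sketch below uses it as a stand-in and does not exercise bson. The Active field and the encode helper are hypothetical additions, there only to show the difference between "not specified" (nil) and "explicitly false".

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Group struct {
	Name string `json:"name"`
}

type Investment struct {
	Symbol string `json:"symbol"`
	Group  *Group `json:"group,omitempty"`  // left out when the pointer is nil
	Active *bool  `json:"active,omitempty"` // hypothetical: nil = not specified
}

// encode is a small helper that returns the JSON text for inv.
func encode(inv Investment) string {
	b, err := json.Marshal(inv)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	no := false
	// No group, Active not specified: both keys are omitted.
	fmt.Println(encode(Investment{Symbol: "ABC"}))
	// Group set and Active explicitly false: both keys are written,
	// because a non-nil pointer is not "empty" even if it points at false.
	fmt.Println(encode(Investment{Symbol: "ABC", Group: &Group{Name: "tech"}, Active: &no}))
}
```

Note that in encoding/json a plain (non-pointer) struct field is never treated as empty by ,omitempty, which is why the pointer is needed for Group.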
Avoid marshaling struct fields if they are empty:
A struct field may be a primitive type (string, int, bool, etc.) or even another struct type.
So sometimes we don't want a struct's field to go into the JSON data (perhaps for a database insertion or an external API call) if it is empty.
Example:
type Investment struct {
Base
Symbol string `json:"symbol" bson:"symbol" binding:"required"`
Group Group `json:"group" bson:"group"`
Fields bson.M `json:"fields" bson:"fields"`
}
If Symbol and Group might contain empty values (false, 0, "", a nil pointer or interface), we can omit them from JSON marshaling as below.
type Investment struct {
Base
Symbol string `json:"symbol,omitempty" bson:"symbol,omitempty" binding:"required"`
Group *Group `json:"group,omitempty" bson:"group,omitempty"`
Fields bson.M `json:"fields" bson:"fields"`
}
Here the Group field is a pointer to a Group struct, and whenever it is nil it will be omitted from JSON marshaling.
We would then set the Group field as below.
// declared investment variable of type Investment struct
investment.Group = &groupData

Properly distinguish between not set (nil) and blank/empty value

What's the correct way in Go to distinguish between a value in a struct that was never set and one that is just empty? For example, given the following:
type Organisation struct {
Category string
Code string
Name string
}
I need to know (for example) if the category was never set, or was saved as blank by the user, should I be doing this:
type Organisation struct {
Category *string
Code *string
Name *string
}
I also need to ensure I correctly persist either null or an empty string to the database.
I'm still learning Go, so it is entirely possible my question needs more info.
The zero value for a string is an empty string, and you can't distinguish between the two.
If you are using the database/sql package, and need to distinguish between NULL and empty strings, consider using the sql.NullString type. It is a simple struct that keeps track of the NULL state:
type NullString struct {
String string
Valid bool // Valid is true if String is not NULL
}
You can scan into this type and use it as a query parameter, and the package will handle the NULL state for you.
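The NULL tracking can be tried without a database connection, because sql.NullString implements both Scan and Value itself. In the sketch below, fromColumn and toParam are hypothetical helper names; they only call those two methods the way the database/sql package would when reading a column or binding a parameter.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// fromColumn (hypothetical helper) feeds a raw column value to Scan.
// A nil src stands for SQL NULL.
func fromColumn(src interface{}) (sql.NullString, error) {
	var ns sql.NullString
	err := ns.Scan(src)
	return ns, err
}

// toParam (hypothetical helper) returns what would be handed to the
// driver as a query parameter: nil for NULL, otherwise the string.
func toParam(ns sql.NullString) (driver.Value, error) {
	return ns.Value()
}

func main() {
	null, _ := fromColumn(nil) // column was NULL
	blank, _ := fromColumn("") // column was an empty string

	fmt.Println("NULL  -> Valid:", null.Valid)
	fmt.Println("blank -> Valid:", blank.Valid)

	p1, _ := toParam(null)
	p2, _ := toParam(blank)
	fmt.Println("NULL  param is nil:", p1 == nil)
	fmt.Println("blank param is nil:", p2 == nil)
}
```

In a struct such as Organisation, each field would then be declared as sql.NullString, and "never set" versus "saved as blank" becomes a check on the Valid flag.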
Google's protocol buffers (https://code.google.com/p/goprotobuf/) use pointers to describe optional fields.
The generated objects provide GetFoo methods which take the pain away from testing for nil (a.GetFoo() returns an empty string if a.Foo is nil, otherwise it returns *a.Foo).
It introduces a nuisance when you want to write literal structs (in tests, for example), because &"something" is not valid syntax to generate a pointer to a string, so you need a helper function (see, for example, the source code of the protocol buffer library for proto.String).
// String is a helper routine that allocates a new string value
// to store v and returns a pointer to it.
func String(v string) *string {
return &v
}
Overall, using pointers to represent optional fields is not without drawbacks, but it's certainly a viable design choice.
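A minimal sketch of the pointer approach applied to the Organisation struct from the question follows. GetCategory is a hypothetical getter written in the style of the generated GetFoo methods, and describe is a hypothetical helper that shows the three states a *string can be in.

```go
package main

import "fmt"

type Organisation struct {
	Category *string
	Code     *string
	Name     *string
}

// String allocates a new string value to store v and returns a pointer to it.
func String(v string) *string {
	return &v
}

// GetCategory (hypothetical) returns "" when the receiver or the field is
// nil, so callers that do not care about "never set" can skip the nil check.
func (o *Organisation) GetCategory() string {
	if o == nil || o.Category == nil {
		return ""
	}
	return *o.Category
}

// describe (hypothetical) distinguishes never set, blank, and non-blank.
func describe(p *string) string {
	switch {
	case p == nil:
		return "never set"
	case *p == "":
		return "set to blank"
	default:
		return "set to " + *p
	}
}

func main() {
	org := Organisation{Code: String(""), Name: String("Acme")}
	fmt.Println("Category:", describe(org.Category))
	fmt.Println("Code:", describe(org.Code))
	fmt.Println("Name:", describe(org.Name))
}
```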
The standard database/sql package provides a NullString struct (members are just String string and Valid bool). To take care of some of the repetitive work of persistence, you could look at an object-relational manager like gorp.
I looked into whether there was some way to distinguish two kinds of empty string just out of curiosity, and couldn't find one. With []byte, []byte{} == nil currently returns false while []byte(nil) == nil returns true, but I'm not sure if the spec guarantees that to always remain true. In any case, it seems like the most practical thing to do is to go with the flow and use NullString.
