Deriving a hashmap/Btree from a strcut of values in rust - data-structures

I am trying to do the following:
Deserialise a struct of fields into another struct of different format.
I am trying to do the transformation via an Hashmap as this seems to be the best suited way.
I am required to do this as a part of transformation of a legacy system, and this being one of the intermediary phases.
I have raised another question which caters to a subset of the same use-case, but do not think it gives enough overview, hence raising this question with more details and code.
(Rust converting a struct o key value pair where the field is not none)
Will be merging both questions once I figure out how.
Scenario:
I am getting input via an IPC through a huge proto.
I am using prost to deserialise this data into a struct.
My deserialised struct has n no of fields and can increase as well.
I need to transform this deserialised struct into a key,value struct. (shown ahead).
The incoming data, will mostly have a majority of null keys .i.e out of n fields, most likely only 1,2 have values at a given time.
Input struct after prost deserialisation:
Using proto3 so unable to define optional fields.
Ideally I would prefer a prost struct of options on every field. .i.e Option instead of string.
struct Data{
int32 field1,
int64 field2,
string field3,
...
}
This needs to be transformed to a genric struct as below for further use:
struct TransformedData
{
string Key
string Value
}
So,
Data{
field1: 20
field2: null
field3: null
field4: null
....
}
becomes
TransformedData{
key:"field1"
Value: "20"
}
Methods I have tried:
Method1
Add serde to the prost struct definiton and deserialise it into a map.
Loop over each item in a map to get values which are non-null.
Use these to create a struct
https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=9645b1158de31fd54976926c9665d6b4
This has its challenges:
Each primitve data type has its own default values and needs to be checked for.
Nested structs will result in object data type which needs to be handled.
Need to iterate over every field of a struct to determine non null values.
1&2 can be mitigated by setting an Optional field(yet to figure out how in proto3 and prost)
But I am not sure how to get over 3.
Iterating over a large struct to find non null fields is not scalable and will be expensive.
Method 2:
I am using prost reflects dynamic reflection to deserialise and get only specified value.
Doing this by ensuring each proto message has two extra fields:
proto -> signifying the proto struct being used when serializing.
key -> signifying the filed which has value when serializing.
let fd = package::FILE_DESCRIPTOR_SET;
let pool = DescriptorPool::decode(fd).unwrap();
let proto_message: package::GenericProto = prost::Message::decode(message.as_ref()).expect("error when de-serialising protobuf message to struct");
let proto = proto_message.proto.as_str();
let key = proto_message.key.as_str();
Then using key , I derive the key from what looks to be a map implemented by prost:
let message_descriptor = pool.get_message_by_name(proto).unwrap();
let dynamic_message = DynamicMessage::decode(message_descriptor, message.as_ref()).unwrap();
let data = dynamic_message.get_field_by_name(key).unwrap().as_ref().clone();
Here :
1.This fails when someone sends a struct with multiple fields filled.
2.This does not work for nested objects, as nested objects in turn needs to be converted to a map.
I am looking for the least expensive way to do the above.
I understand the solution I am looking for is pretty out there and I am taking a one size fits all approach.
Any pointers appreciated.

Related

Go Struct data identifier/authenticity

How to make a struct with fields and values unique for the data it holds?, The order of fields doesn't matter, what matter is the values of struct's fields must be exactly the same in order to be authenticated/identified.
Currently I used SHA256 hashing method. I hash the struct with the said data. For the next incoming data with same struct, I hash again to verify against previously hashed data to check if it was existed before.
So, let's say:
type F struct {
A string
B string
C interface{}
}
The value of C can be arbitrary, can be simple types(string, int,etc), map, or struct(e.g. nested json). How to make every data passed to F struct made unique. Have I already doing right using SHA256 on the struct?, I'm worry about the C value might affect the value of hash.

How to process pair array data

I have json data like:
`[{"fea1":12345},{"fea2":23456}]`
I want to unmarshal them into Go structs.
Now I defined a struct like []map[string]int.
It works, but I think this is not the best way which processing pair data with map structure.
Processing large data set also cost much resource if using map structure.
Is there a more graceful way to implement it?
If you have predefined set of fields, you may use struct like this:
type Fea struct {
Fea1 int `json:"fea1,omitempty"`
Fea2 int `json:"fea2,omitempty"`
}
type Feas []Fea
var feas Feas
Then Unmarshal to feas. This way present fields would be filled, others would be empty.

Getting error while access the struct type of array element as undefined (type []ParentIDInfo has no field or method PCOrderID)

I am new to golang and I have one issue which I think community can help me to solve it.
I have one data structure like below
type ParentIDInfo struct {
PCOrderID string `json:"PCorderId",omitempty"`
TableVarieties TableVarietyDC `json:"tableVariety",omitempty"`
ProduceID string `json:"PRID",omitempty"`
}
type PCDCOrderAsset struct {
PcID string `json:"PCID",omitempty"`
DcID string `json:"DCID",omitempty"`
RequiredDate string `json:"requiredDate",omitempty"`
Qty uint64 `json:"QTY",omitempty"`
OrderID string `json:"ORDERID",omitempty"`
Status string `json:"STATUS",omitempty"`
Produce string `json:"Produce",omitempty"`
Variety string `json:"VARIETY",omitempty"`
Transports []TransportaionPCDC `json:"Transportaion",omitempty"`
ParentInfo []ParentIDInfo `json:"ParentInfo",omitempty"`
So I have issue to access the PCOrderID which inside the []ParentIDInfo . I have tried below however I getting error as "pcdcorder.ParentInfo.PCOrderID undefined (type []ParentIDInfo has no field or method PCOrderID)"
keyfarmercas = append(keyfarmercas, pcdcorder.ParentInfo.PCOrderID)
Any help will be very good
Thanks in advance
PCDCOrderAsset.ParentInfo is not a struct, it does not have a PCOrderID field. It's a slice (of element type ParentIDInfo), so its elements do, e.g. pcdcorder.ParentInfo[0].PCOrderID.
Whether this is what you want we can't tell. pcdcorder.ParentInfo[0].PCOrderID gives you the PCOrderID field of the first element of the slice. Based on your question this may or may not be what you want. You may want to append all IDs (one from each element). Also note that if the slice is empty (its length is 0), then pcdcorder.ParentInfo[0] would result in a runtime panic. You could avoid that by first checking its length and only index it if its not empty.
In case you'd want to add ids of all elements, you could use a for loop to do that, e.g.:
for i := range pcdorder.ParentInfo {
keyfarmercas = append(keyfarmercas, pcdcorder.ParentInfo[i].PCOrderID)
}

How to convert one struct to another in Go when one includes another?

I would like to know if there is easy way to convert from one struct to another in Go when one struct includes the other.
For example
type Type1 struct {
Field1 int
Field2 string
}
type Type2 struct {
Field1 int
}
I know that it can be handled like this
var a Type1{10, "A"}
var b Type2
b.Field1 = a.Field1
but if there are many fields, I will have to write numerous assignments. Is there any other way to handle it without multiple assignments?
In a word, is there anything like b = _.omit(a, 'Field2') in javascript?
Not directly, no. You can freely convert between identical types only.
You can get various levels of solutions to this type of problem:
writing the assignments out yourself (likely the best performance)
using reflection to copy from one to the other based on field names
something quick-and-dirty like marshalling one type to JSON then unmarshalling to the other type (which is basically using reflection under the hood with a plaintext middleman, so it's even less efficient, but can be done with little work on your part)

Properly distinguish between not set (nil) and blank/empty value

Whats the correct way in go to distinguish between when a value in a struct was never set, or is just empty, for example, given the following:
type Organisation struct {
Category string
Code string
Name string
}
I need to know (for example) if the category was never set, or was saved as blank by the user, should I be doing this:
type Organisation struct {
Category *string
Code *string
Name *string
}
I also need to ensure I correctly persist either null or an empty string to the database
I'm still learning GO so it is entirely possible my question needs more info.
The zero value for a string is an empty string, and you can't distinguish between the two.
If you are using the database/sql package, and need to distinguish between NULL and empty strings, consider using the sql.NullString type. It is a simple struct that keeps track of the NULL state:
type NullString struct {
String string
Valid bool // Valid is true if String is not NULL
}
You can scan into this type and use it as a query parameter, and the package will handle the NULL state for you.
Google's protocol buffers (https://code.google.com/p/goprotobuf/) use pointers to describe optional fields.
The generated objects provide GetFoo methods which take the pain away from testing for nil (a.GetFoo() returns an empty string if a.Foo is nil, otherwise it returns *a.Foo).
It introduces a nuisance when you want to write literal structs (in tests, for example), because &"something" is not valid syntax to generate a pointer to a string, so you need a helper function (see, for example, the source code of the protocol buffer library for proto.String).
// String is a helper routine that allocates a new string value
// to store v and returns a pointer to it.
func String(v string) *string {
return &v
}
Overall, using pointers to represent optional fields is not without drawbacks, but it's certainly a viable design choice.
The standard database/sql package provides a NullString struct (members are just String string and Valid bool). To take care of some of the repetitive work of persistence, you could look at an object-relational manager like gorp.
I looked into whether there was some way to distinguish two kinds of empty string just out of curiosity, and couldn't find one. With []bytes, []byte{} == []byte(nil) currently returns false, but I'm not sure if the spec guarantees that to always remain true. In any case, it seems like the most practical thing to do is to go with the flow and use NullString.

Resources