How to achieve dynamic custom fields of different data types using gRPC proto - protocol-buffers

Looking for a solution in gRPC protobuf to implement dynamic fields of different data types for a multi-tenant application.
Also, there can be any number of dynamic fields, depending on the tenant.
Using a map in proto, I can define a separate map for each data type. Is there a more optimized way to achieve this?
Any help on this is appreciated.

There are a few different ways of transferring dynamic content in protobuf. Which is ideal varies depending on your use case. The options are ordered by their dynamism. Less dynamic options normally have better performance.
Use google.protobuf.Any in proto3. This is useful when you want to store arbitrary protobuf messages and is commonly used to provide extension points. It replaces extensions from proto2. Any has a child message and its type, so your application can check at runtime if it understands the type. If your application does not know the type, then it can copy the Any but can't decode its contents. Any cannot directly hold scalar types (like int32), but each scalar has a wrapper message that can be used instead. Because each Any includes the type of the message as a string, it is poorly suited if you need lots of them with small contents.
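As a rough Java sketch of packing and unpacking a wrapped scalar (this assumes protobuf-java is on the classpath; the class name AnyExample is made up for illustration):
import com.google.protobuf.Any;
import com.google.protobuf.Int32Value;
import com.google.protobuf.InvalidProtocolBufferException;

public class AnyExample {
    public static void main(String[] args) throws InvalidProtocolBufferException {
        // Scalars cannot go into an Any directly, but their wrapper messages can.
        Any packed = Any.pack(Int32Value.of(42));

        // The receiver checks the type at runtime before unpacking.
        if (packed.is(Int32Value.class)) {
            int value = packed.unpack(Int32Value.class).getValue();
            System.out.println(value); // prints 42
        }
    }
}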
Use the JSON mapping message google.protobuf.Value. This is useful when you want to store arbitrary schemaless JSON data. Because it does not need to store the full type of its contents, a Value holding a ListValue of number_values (doubles) will be more compact on-the-wire than repeated Any. But if a schema is available, an Any containing a message with repeated double will be more compact on-the-wire than Value.
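A similar Java sketch with the well-known struct types, again assuming protobuf-java is available and using an illustrative class name:
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;

public class ValueExample {
    public static void main(String[] args) {
        // Schemaless JSON-style data: a list of numbers wrapped in a single Value.
        Value list = Value.newBuilder()
                .setListValue(ListValue.newBuilder()
                        .addValues(Value.newBuilder().setNumberValue(1.5))
                        .addValues(Value.newBuilder().setNumberValue(2.5)))
                .build();

        System.out.println(list.getListValue().getValuesCount()); // prints 2
    }
}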
Use a oneof that contains each permitted type. Commonly a new message type is needed to hold the oneof. This is useful when you can restrict the schema but values have a relationship, like if the position of each value in a list is important and the types in the list are mixed. This is similar to Value but lets you choose your own types. While technically more powerful than Value it is typically used to produce a more constrained data structure. It is equal to or more compact on-the-wire than Value. This requires knowing the needed types ahead-of-time. Example: map<string, MyValue>, where MyValue is:
message MyValue {
  oneof kind {
    int32 int_value = 1;
    string string_value = 2;
  }
}
Use a separate field/collection for each type. For each type you can have a separate field in a protobuf message. This is the approach you were considering. This is the most compact on-the-wire and most efficient in memory. You must know the types you are interested in storing ahead of time. Example:
map<string, int32> int_values = 1;
map<string, string> string_values = 2;

Related

How to unmarshal protobuf data into a custom struct in Golang

I have a proxy service that translates protobuf into another struct. I can just write some manual code to do that, but that is inefficient and boilerplate. I can also transform the protobuf data to JSON and deserialize the JSON data into the destination struct, but that is slow and CPU heavy.
The Unmarshaler interface is now deprecated, and the Message interface has internal types which I cannot implement in my project.
Is there a way I can do this now?
Pseudocode: basically, if Go's reflection supports setting and getting struct/class fields by some sort of field identifier, then you can do this. Something like this in C# works, so long as the field types in the two classes are the same (because in C#, I'm doing object = object, which ends up being OK if they're the same actual type).
SourceStructType sourceStruct;
DestStructType destStruct;
foreach (FieldInfo sourceField in sourceStruct.GetType().GetFields())
{
    FieldInfo destField = destStruct.GetType().GetField(sourceField.Name);
    destField.SetValue(destStruct, sourceField.GetValue(sourceStruct));
}
If the structs are more complex - i.e. they have structs within them, then you'll have to recurse down into them. It can get fiddly, but once written you'll never have to write it ever again!

GraphQL custom scalar type for HTML structure

During her brilliant presentation about scaling GraphQL, Leanne Shapton showed some best practices.
One of the most attractive to me was the custom scalar type for an HTML structure. In the video it's at [10:16].
She proposed using a custom HTML scalar instead of a simple String.
I wish you could show your implementation of this scalar, or how you handle these cases, as I'm only using String for any HTML structure, which doesn't seem to be a perfect approach.
I'm not asking about how to create scalar types or for general information about what a scalar type is. I'm wondering whether someone else already handles HTML this way and has any working solutions.
At a pure GraphQL level, the only thing you can (and must) do is include a definition for the scalar type:
scalar HTML
Once you've done that, you can use it as a type as shown in the slide you cite. In queries and results it will appear as some sort of scalar (string or numeric) value.
Different server and client libraries have different ways of dealing with this; there may be a uniform way to map a specific GraphQL scalar type to a native-language object. In graphql-js, a GraphQLScalarType object takes parseValue and serialize functions to convert between the two representations, for example. If you're just using a custom scalar type as a tagged string these can be very simple functions.
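For instance, if you happen to be on the JVM with graphql-java (a different library from the graphql-js mentioned above), a minimal pass-through coercion for such a tagged-string scalar might look roughly like this; all names here are illustrative:
import graphql.language.StringValue;
import graphql.schema.Coercing;
import graphql.schema.GraphQLScalarType;

public class HtmlScalar {
    // A pass-through coercion: the HTML scalar is treated as a tagged string.
    public static final GraphQLScalarType HTML = GraphQLScalarType.newScalar()
            .name("HTML")
            .description("A string expected to contain HTML markup")
            .coercing(new Coercing<String, String>() {
                @Override
                public String serialize(Object dataFetcherResult) {
                    return dataFetcherResult.toString();
                }

                @Override
                public String parseValue(Object input) {
                    return input.toString();
                }

                @Override
                public String parseLiteral(Object input) {
                    return ((StringValue) input).getValue();
                }
            })
            .build();
}
Whether the scalar does any real validation (for example, sanitizing the markup) is up to you; as a pure tag it mainly documents intent in the schema.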

Best way to validate and extend constructor parameters in Scala 2.10

I want to have a class that has a number of fields such as String, Boolean, etc., and when the class is constructed I want to have a field name associated with each field and verify the field (using regex for strings). Ideally I would just like to specify in the constructor that the parameter needs to meet certain criteria.
Some sample code of how this might look:
case class Data(val name: String ..., val fileName: String ...) {
  name.verify
  // Access fieldName associated with the name parameter.
  println(name.fieldName)     // "Name"
  println(fileName.fieldName) // "File Name"
}

val x = Data("testName", "testFile")
// Treat name as if it was just a string field in Data
x.name // Is of type string, does not expose fieldName, etc
Is there an elegant way to achieve this?
EDIT:
I don't think I have been able to get across clearly what I am after.
I have a class with a number of string parameters. Each of those parameters needs to be validated in a specific way, and I also want to have a string fieldName associated with each parameter. However, I want to still be able to treat the parameter as if it was just a normal string (see the example).
I could code the logic into Data and as an apply method of the Data companion object for each parameter, but I was hoping to have something more generic.
Putting logic (such as parameter validation) in constructors is dubious. Throwing exceptions from constructors is doubly so.
Usually this kind of creational pattern is best served with one or more factory methods or a builder of some sort.
For a basic factory, just define a companion with the factory methods you want. If you want the same short-hand construction notation (new-free) you can overload the predefined apply (though you may not replace the one whose signature matches the case class constructor exactly).
If you want to spare your client code the messiness of dealing with exceptions when validation fails, you can return Option[Data] or Either[ErrorIndication, Data] instead. Or you can go with ScalaZ's Validation, which I'm going to arbitrarily declare to be beyond the scope of this answer ('cause I'm not sufficiently familiar with it...)
However, you cannot have instances that differ in what properties they present. Not even subclasses can subtract from the public API. If you need to be able to do that, you'll need a more elaborate construct such as a trait for the common parts and separate case classes for the variants and / or extensions.

Displaying computed data with external dependencies

I'm building a report that needs to include an 'estimate' column, which is based on data that's not available in the dataset.
Ideally I'd like to be able to define a Java interface
public int getEstimate(int foo_id, int bar_id, int quantity);
where foo_id, bar_id and quantity are available in the row I want the estimate presented.
There will be multiple strategies for producing the estimate so it would be good to use an interface to allow swapping them when needed.
Looking at the BIRT docs, I think it's possible I ought to be using the event handler mechanisms, but that seems to only allow defining a class to use and I'd somehow like to inject a configured estimator.
A non-obfuscated example might be to say that I have a dataset which includes an IP address column, and I'd like to be able to use some GeoIP service to resolve the country from the IP address. In that case I'd have an interface public String getCountryName(String address) and the actual implementations may use MaxMind, a local cache or some other system.
How would I go about doing this?
Or.. would I be better off by writing a scripted data source that can integrate the computed data before delivering it to BIRT?
Or.. some sort of scripted data source that is then used to create a join data set?
I think a Scripted Data Source would work fine, but a Java-based event handler would be more straightforward. You can implement it as a simple POJO and get access to any and all the complex objects and tools that will allow you to calculate your estimate. The simplest solution of all may simply be to add a calculated field to the data set.
When creating the calculated field, you can get pretty complex in terms of the scripting logic you can leverage in order to produce the resultant value. The nicest thing about this route is that all the other column values in the row (which I assume you need to calculate the estimate) are made available via the Expression editor. You can pull in complex objects (POJOs) to help in your calculations here as well by using the "Packages" object (i.e. var red = new Packages.redwood.HelloWorld())
If you want to create the Event Handler class, here is what I would do. I would create a text object and bind the onCreate event to your POJO (by extending the TextItemEventAdapter) and override the "onCreate" method. There you can do any work you want and at the end simply call 'text.setText(theEstimateResult);' to make the estimate itself visible. As far as accessing data values to do your calculations, you can get to those in the POJO too. I assume the estimate will be a part of a larger table of values. You can access any specific row value via the reportContext.
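A rough Java sketch of such an event handler (the Estimator interface is the one from the question; how the row values reach the handler is simplified here and assumes they were stashed via reportContext.setGlobalVariable, so treat those names as placeholders):
import org.eclipse.birt.report.engine.api.script.IReportContext;
import org.eclipse.birt.report.engine.api.script.eventadapter.TextItemEventAdapter;
import org.eclipse.birt.report.engine.api.script.instance.ITextItemInstance;

public class EstimateTextHandler extends TextItemEventAdapter {

    // The strategy interface from the question, repeated here so the sketch is self-contained.
    public interface Estimator {
        int getEstimate(int fooId, int barId, int quantity);
    }

    // Swap in whichever estimation strategy you need; a trivial placeholder is used here.
    private final Estimator estimator = (fooId, barId, quantity) -> quantity * 10;

    @Override
    public void onCreate(ITextItemInstance text, IReportContext reportContext) {
        // Assumes the row values were stored earlier (e.g. in a data item's onFetch)
        // via reportContext.setGlobalVariable(...).
        int fooId = (Integer) reportContext.getGlobalVariable("foo_id");
        int barId = (Integer) reportContext.getGlobalVariable("bar_id");
        int quantity = (Integer) reportContext.getGlobalVariable("quantity");

        text.setText(String.valueOf(estimator.getEstimate(fooId, barId, quantity)));
    }
}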
Those are the two ideas I would give a try first. The computed column is the fastest to implement and the least likely to throw you a curve during deployment. Let me know which way you choose and we can hash it out further if needed.

Appropriate data structure for flat file processing?

Essentially, I have to get a flat file into a database. The flat files come in with the first two characters on each line indicating which type of record it is.
Do I create a class for each record type with properties matching the fields in the record? Should I just use arrays?
I want to load the data into some sort of data structure before saving it in the database so that I can use unit tests to verify that the data was loaded correctly.
Here's a sample of what I have to work with (BAI2 bank statements):
01,121000358,CLIENT,050312,0213,1,80,1,2/
02,CLIENT-STANDARD,BOFAGB22,1,050311,2359,,/
03,600812345678,GBP,fab1,111319005,,V,050314,0000/
88,fab2,113781251,,V,050315,0000,fab3,113781251,,V,050316,0000/
88,fab4,113781251,,V,050317,0000,fab5,113781251,,V,050318,0000/
88,010,0,,,015,0,,,045,0,,,100,302982205,,,400,302982205,,/
16,169,57626223,V,050311,0000,102 0101857345,/
88,LLOYDS TSB BANK PL 779300 99129797
88,TRF/REF 6008ABS12300015439
88,102 0101857345 K BANK GIRO CREDIT
88,/IVD-11 MAR
49,1778372829,90/
98,1778372839,1,91/
99,1778372839,1,92
I'd recommend creating classes (or structs, or whatever value type your language supports), as
record.ClientReference
is so much more descriptive than
record[0]
and, if you're using the (wonderful!) FileHelpers Library, then your terms are pretty much dictated for you.
Validation logic usually has at least two levels: the coarser level being "well-formatted" and the finer level being "correct data".
There are a few separate problems here. One issue is that of simply verifying the data, or writing tests to make sure that your parsing is accurate. A simple way to do this is to parse into a class that accepts a given range of values, and throws the appropriate error if not,
e.g.
public void setField1(int i)
{
    if (i > 100) throw new InvalidDataException...
}
Creating different classes for each record type is something you might want to do if the parsing logic is significantly different for different codes, so you don't have conditional logic like
public void setField2(String s)
{
    if (field1 == 88 && s.equals ...
    else if (field2 == 22 && s
}
yechh.
When I have had to load this kind of data in the past, I have put it all into a work table with the first two characters in one field and the rest in another. Then I have parsed it out to the appropriate other work tables based on the first two characters. Then I have done any cleanup and validation before inserting the data from the second set of work tables into the database.
In SQL Server you can do this through a DTS (2000) or an SSIS package, and using SSIS you may be able to process the data on the fly without storing it in work tables first, but the process is similar: use the first two characters to determine the data flow branch to use, then parse the rest of the record into some type of holding mechanism, and then clean up and validate before inserting. I'm sure other databases also have some type of mechanism for importing data and would use a similar process.
I agree that if your data format has any sort of complexity you should create a set of custom classes to parse and hold the data, perform validation, and do any other appropriate model tasks (for instance, return a human readable description, although some would argue this would be better to put into a separate view class). This would probably be a good situation to use inheritance, where you have a parent class (possibly abstract) define the properties and methods common to all types of records, and each child class can override these methods to provide their own parsing and validation if necessary, or add their own properties and methods.
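A minimal Java sketch of that parent/child layout, keyed off the two-character record code and using the type 03 line from the sample above (the class names are invented):
// All names (BaiRecord, AccountRecord) are invented for illustration.
abstract class BaiRecord {
    protected final String[] fields;

    protected BaiRecord(String line) {
        // BAI2 fields are comma-separated; the trailing '/' delimiter is stripped here.
        this.fields = line.replaceAll("/$", "").split(",");
    }

    // The two-character code that identifies the record type.
    public String recordCode() {
        return fields[0];
    }

    // Each record type supplies its own validation rules.
    public abstract boolean isValid();
}

// Type "03" from the sample above, e.g. "03,600812345678,GBP,..."
class AccountRecord extends BaiRecord {
    AccountRecord(String line) {
        super(line);
    }

    public String accountNumber() {
        return fields[1];
    }

    public String currencyCode() {
        return fields[2];
    }

    @Override
    public boolean isValid() {
        return fields.length > 2 && !accountNumber().isEmpty() && currencyCode().length() == 3;
    }
}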
Creating a class for each type of row would be a better solution than using Arrays.
That said, however, in the past I have used Arraylists of Hashtables to accomplish the same thing. Each item in the arraylist is a row, and each entry in the hashtable is a key/value pair representing column name and cell value.
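A tiny Java sketch of that shape, using one of the sample rows above (the column names are arbitrary):
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

public class RowHolderExample {
    public static void main(String[] args) {
        // One Hashtable per row: column name -> cell value.
        List<Map<String, String>> rows = new ArrayList<>();

        Map<String, String> row = new Hashtable<>();
        row.put("RecordCode", "03");
        row.put("AccountNumber", "600812345678");
        row.put("Currency", "GBP");
        rows.add(row);

        System.out.println(rows.get(0).get("AccountNumber")); // prints 600812345678
    }
}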
Why not start by designing the database that will hold the data? Then you can use Entity Framework to generate the classes for you.
here's a wacky idea:
if you were working in Perl, you could use DBD::CSV to read data from your flat file, provided you gave it the correct values for separator and EOL characters. you'd then read rows from the flat file by means of SQL statements; DBI will make them into standard Perl data structures for you, and you can run whatever validation logic you like. once each row passes all the validation tests, you'd be able to write it into the destination database using DBD::whatever.
-steve
