Displaying computed data with external dependencies - spring

I'm building a report that needs to include an 'estimate' column, which is based on data that's not available in the dataset.
Ideally I'd like to be able to define a Java interface
public int getEstimate(int foo_id, int bar_id, int quantity);
where foo_id, bar_id and quantity are available in the row for which I want the estimate presented.
There will be multiple strategies for producing the estimate so it would be good to use an interface to allow swapping them when needed.
Looking at the BIRT docs, I think it's possible I ought to be using the event handler mechanisms, but that seems to only allow defining a class to use and I'd somehow like to inject a configured estimator.
A non-obfuscated example might be to say that I have a dataset which includes an IP address column, and I'd like to be able to use some GeoIP service to resolve the country from the IP address. In that case I'd have an interface public String getCountryName(String address) and the actual implementations may use MaxMind, a local cache or some other system.
How would I go about doing this?
Or.. would I be better off by writing a scripted data source that can integrate the computed data before delivering it to BIRT?
Or.. some sort of scripted data source that is then used to create a join data set?

I think a Scripted Data Source would work fine, but a Java-based event handler would be more straightforward. You can implement it as a simple POJO and get access to any and all the complex objects and tools that will allow you to calculate your estimate. The simplest solution of all may simply be to add a calculated field to the data set.
When creating the calculated field, you can get pretty complex in terms of the scripting logic you can leverage to produce the resulting value. The nicest thing about this route is that all the other column values in the row (which I assume you need to calculate the estimate) are made available via the Expression editor. You can pull in complex objects (POJOs) to help in your calculations here as well by using the "Packages" object (e.g. var red = new Packages.redwood.HelloWorld()).
If you want to create the event handler class, here is what I would do. I would create a text object and bind the onCreate event to your POJO (by extending TextItemEventAdapter) and override the "onCreate" method. There you can do any work you want to and at the end simply call 'text.setText(theEstimateResult);' to make the estimate itself visible. As far as accessing data values to do your calculations, you can get to those in the POJO too. I assume the estimate will be part of a larger table of values; you can access any specific row value via the reportContext.
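For reference, a rough sketch of that event-handler route, assuming BIRT's TextItemEventAdapter/ITextItemInstance API; the Estimator interface and the "estimator" app-context key are made up, and a configured implementation is assumed to be registered by whatever code launches the report engine:

import java.util.Map;

import org.eclipse.birt.report.engine.api.script.IReportContext;
import org.eclipse.birt.report.engine.api.script.eventadapter.TextItemEventAdapter;
import org.eclipse.birt.report.engine.api.script.instance.ITextItemInstance;

public class EstimateTextHandler extends TextItemEventAdapter {

    // The interface from the question; the concrete (e.g. Spring-configured) impl lives elsewhere.
    public interface Estimator {
        int getEstimate(int fooId, int barId, int quantity);
    }

    @Override
    public void onCreate(ITextItemInstance text, IReportContext reportContext) {
        Map appContext = reportContext.getAppContext();
        Estimator estimator = (Estimator) appContext.get("estimator"); // assumed key

        // How you obtain foo_id, bar_id and quantity depends on your layout
        // (row bindings, report parameters, ...); placeholders here.
        int fooId = 0, barId = 0, quantity = 0;

        text.setText(String.valueOf(estimator.getEstimate(fooId, barId, quantity)));
    }
}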
Those are the two ideas I would give a try first. The computed column is the fastest to implement and the least likely to throw you a curve during deployment. Let me know which way you choose and we can hash it out further if needed.

How to model updated items with UICollectionViewDiffableDataSource

I'm struggling to understand how to use UICollectionViewDiffableDataSource and NSDiffableDataSourceSnapshot to model change of items.
Let's say I have a simple item which looks like this:
struct Item {
    var id: Int
    var name: String
}
Based on the names of the generic parameters, UICollectionViewDiffableDataSource and NSDiffableDataSourceSnapshot should operate not with Item itself, but only with its identifier, which is Int in this example.
On the other hand, again based on the names of the generic parameters, UICollectionView.CellRegistration should operate on complete Items. So my guess is that UICollectionViewDiffableDataSource.CellProvider is responsible for finding complete Items by id. Which is unfortunate, because then, aside from snapshots, I need to maintain separate storage for the items. And there is a risk that this storage may go out of sync with the snapshots.
But it is still not clear to me how I inform UICollectionViewDiffableDataSource that some item changed its name without changing its id. I want UICollectionView to update the relevant cell and animate the change in content size, but I don't want an insertion or removal animation.
There are two approaches that would solve your problem in this scenario.
The first is to conform your Item model to the Hashable protocol. This would allow you to use the entire model as an identifier, and the cell provider closure would pass you an object of type Item. UICollectionViewDiffableDataSource would use the hash value for each instance of your model (which would consider both the id and name properties, thereby solving your name-changing issue) to identify the data for a cell. This is better than trying to trick the collection view data source into considering only the id as the identifier because, as you stated, other aspects of the model might change. The whole point of structs is to act as value types, where the composition of all the model's properties determines its 'value'; there's no need to trick the collection view data source into looking only at Item.id.
The second is to do as you said and create a separate dictionary from which you can retrieve the Items based on their ids. While it is slightly more work to maintain a dictionary, it is a fairly trivial difference in terms of lines of code. All you should do is dump and recalculate the dictionary every time you apply a new snapshot. In this case, to update a cell when the model changes, make sure to swap out the model in your dictionary and call reloadItems on your snapshot.
The second option is generally my preferred choice, because the point of diffable data sources is to allow handling massive data sets while only concerning the data source with a simple identifier for each item. In this case, though, your model is so simple that there's really no concern about wasted compute time calculating hash values, etc. If you think your model is likely to grow over time, I would probably go with the dictionary approach.

Trying to identify if a data injection method has a name already

Let's say we have a class "Car" that has different pieces of data (maker, model, color, fabrication date, registration date, etc.). The class has no method to fetch the data itself, but it knows to ask for it from another object passed in via the constructor (let's call it DS for short), and the same applies when it needs to push changes back.
A method getColor() would be implemented like this
if (!this->loaded('color')) {
    this->askDS('color') // this will do the necessary work to generate a request to DS
}
return this->information('color');
Nothing too fancy so far. Now comes the part I want to find out about: does this have a name, and are there libraries/frameworks that already do it?
DS has a list of methods registered dynamically based on the class that needs data. For Car we have:
1. input: car serial number; output: the method to use to read the numbers and extract raw values
2. input: car raw color value; output: color code
3. input: car color code, manufacturer, year, model; output: human-readable color (for example, navy blue)
Now, neither DS nor any of the methods has a preordained sequence of calls that starts from a serial number and returns the colour blue, but DS can construct a chain of methods that, starting from one set of data, can be run in order to produce the desired data.
For our example above, DS runs 1, 2 and 3 in that order and injects the resulting data into the object that needed it.
Now, if the car needs registration info, we have a method (4) that gets it from the police database with an API request.
So, given:
- a type of model (class/object)
- a list of methods that take a fixed list of inputs (object properties) and give out a fixed list of outputs (object properties)
- a class DS that can glue the methods together and run the needed ones for a model to get from property A (serial) to property B (human-readable colour), without the model or DS having a preconfigured way to get this data, instead finding it as needed
does this have a name, or is it already implemented somewhere?
I've implemented a very basic prototype and it works very nicely, and I think this approach has useful features:
- if you have a set of methods that do SQL queries and your app then switches to using an API, you only need to change the methods and don't have to touch any other part of the application
- when looking for a chain of methods that resolves the 'need' the object has, you can find a method chain, run it, and if it fails keep looking for another list of methods based on the currently available data - so if you have multiple sources for a piece of data, it can try multiple versions
- starting from the above point, I could begin with an app that only has SQL queries for data retrieval - when I find that a part of the app overloads the SQL server, I could add a method to retrieve data from a cache with a lower cost than the database one (or multiple layered caches, each with different costs)
- I could probably add business logic into the mix the same way as the cache, and present different data based on the user's location/options
- this requires less coding overall, and decouples the data source from the object, making each piece easier to mock/test
What is needed to make this fast is a caching solution for the discovered method chains, since matching hundreds of thousands of methods per model type would be time-consuming. I don't think this is very hard to do: just store all found chains in memory as you find them, along with some metadata so a search can be resumed from any point in time; when you update the methods, just clear the cache and take a performance hit on the first requests.
Thank you for your time
What you describe sounds like a somewhat roundabout way of doing Dependency Injection. Quote:
"Passing the service to the client, rather than allowing a client to
build or find the service, is the fundamental requirement of the
pattern."
Depending on what language you're using, there should be several Dependency Injection frameworks/libraries available.
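For illustration, a minimal constructor-injection sketch in Java (the types and names are made up, not taken from your prototype): the Car is handed its data source rather than building or locating one itself.

interface DataSource {
    String fetch(String property);
}

public class Car {
    private final DataSource ds;

    public Car(DataSource ds) {   // the dependency is passed in, never constructed or looked up here
        this.ds = ds;
    }

    public String getColor() {
        return ds.fetch("color");
    }
}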

Best practice with coding system values

I think this should be an easy one, but I haven't found any clear answer on what the best practice would be.
In an application, we keep the current status of an order (open, canceled, shipped, closed, ...).
These values cannot change without a code change, but the application should meet the following criteria:
1. status names should be easily displayed in different languages
2. application can search via freetext status names (like googling for "open")
3. status_id should be available to the developer via an enum
4. zero headache when adding new statuses
Possible ways we have tackled this so far:
1. having a DB table status with PK (id, language_id) and a separate enum which represents these statuses in the application.
PROS: 1, 2, 3 work out of the box. CONS: 4 requires running an update script on every client installation, and SQL selects can become large and cumbersome when dealing with a lot of code tables.
2. having just an enum:
PROS: 3, 4. CONS: 1, 2 are a total nightmare.
3. having enums which populate the database tables on each start of the application:
PROS: 1, 2, 3, 4 all work. CONS: some overhead on application start, and SQL selects can become large and cumbersome when dealing with a lot of code tables.
What is the most common way of tackling this problem?
Sounds like you summarized it pretty well yourself, and comparing the pros/cons points towards #3. Just one comment when you implement #3 though:
Use a caching mechanism (even a simple HashMap!) and add an option to refresh the cache - it will ease your work when you want to change values (without the need to restart every time!).
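A minimal sketch of that idea in Java (StatusRepository and the method names are illustrative, not an existing API):

import java.util.HashMap;
import java.util.Map;

public class StatusCache {

    public interface StatusRepository {               // hypothetical DAO over the status table
        Map<Integer, String> loadStatusNames(String languageId);
    }

    private final StatusRepository repository;
    private final Map<String, Map<Integer, String>> namesByLanguage = new HashMap<>();

    public StatusCache(StatusRepository repository) {
        this.repository = repository;
    }

    public synchronized String name(int statusId, String languageId) {
        return namesByLanguage
                .computeIfAbsent(languageId, repository::loadStatusNames)
                .get(statusId);
    }

    public synchronized void refresh() {              // no restart needed when values change
        namesByLanguage.clear();
    }
}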
I would, and do, use method 3 because it is the best of the lot. You can use resource files to store the translations in and map the enum values to keys in the resource files. Your database can contain the id of the enum for the status.
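A minimal sketch of that enum-plus-resource-file approach; the bundle base name "OrderStatus", the key scheme and the ids are assumptions, not from the answer:

import java.util.Locale;
import java.util.ResourceBundle;

public enum OrderStatus {
    OPEN(1), CANCELED(2), SHIPPED(3), CLOSED(4);

    private final int id;                 // the value stored in the database

    OrderStatus(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }

    public String displayName(Locale locale) {
        // e.g. OrderStatus_de.properties would contain: status.open=Offen
        return ResourceBundle.getBundle("OrderStatus", locale)
                             .getString("status." + name().toLowerCase(Locale.ROOT));
    }
}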
1. status names should be easily displayed in different languages
2. application can search via freetext status names (like googling for "open")
These are the interface layer's concern; you'd better not mix them into your domain model.
I would set up a mapping between the status enum and i18n codes. The mapping could be stored in a file (cached in memory) or hardcoded.
For example, if you use a DTO or view adapter to render your UI:
public class OrderDetailViewAdapter {
    private Order order;
    private I18nMapper i18nMapper; // use a hardcoded switch-case or a file-based impl

    public String getStatus() {
        return i18nMapper.to(order.getStatus());
    }
}
Or you could do this before populating your DTOs.
You could use a similar solution for goal 2: when the user types text, find the corresponding enum from the mapping and use the enum for the search.
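A rough sketch of that reverse lookup (the enum values and the injected display-name map are placeholders standing in for whatever the i18n mapping above returns):

import java.util.Locale;
import java.util.Map;

public class StatusSearch {
    public enum OrderStatus { OPEN, CANCELED, SHIPPED, CLOSED }

    private final Map<OrderStatus, String> displayNames;   // e.g. built from the i18n mapping

    public StatusSearch(Map<OrderStatus, String> displayNames) {
        this.displayNames = displayNames;
    }

    public OrderStatus statusFor(String userText) {
        for (Map.Entry<OrderStatus, String> entry : displayNames.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(userText.trim())) {
                return entry.getKey();
            }
        }
        return null; // no status matched; fall back to ordinary text search
    }
}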
Anyway, the fewer DB tables you use, the better.
Personally, I always use a dedicated enum class inside the domain. The only responsibility of this class is holding the status name (OPEN, CANCELED, SHIPPED, ...). The status name is not visible outside the codebase. The status can also be stored in a database field as a string (varchar or similar).
For the purpose of rendering, depending on the number of use cases, sometimes I implement formatting inside a formatter (e.g. OrderFormatter::formatStatusName(), OrderFormatter::formatAbbreviatedStatusName(), ...). If formatting is needed often, I create a dedicated class with all the formatting styles needed (OrderStatusFormatter::short(), OrderStatusFormatter::abbreviated(), ...). Of course, an internal mapping is needed to map the status name to a status title, and this is the tricky part. But if you want layering, you can't avoid mapping.
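A rough Java rendering of the formatter idea (the original examples use PHP-style names such as OrderStatusFormatter::short(); the enum values and titles here are illustrative, and actual translation can still happen later, e.g. in the template):

enum OrderStatus { OPEN, CANCELED, SHIPPED, CLOSED }   // domain enum: holds names only

public class OrderStatusFormatter {

    // The "tricky part": the internal mapping from status name to a display title.
    public String shortTitle(OrderStatus status) {
        switch (status) {
            case OPEN:     return "Open";
            case CANCELED: return "Canceled";
            case SHIPPED:  return "Shipped";
            case CLOSED:   return "Closed";
            default:       return status.name();
        }
    }
}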
Translation is not dealt with so far. I translate strings inside templates, so the formatters are clean of that responsibility. To summarize:
enum inside domain model
formatter inside presentation layer
translation inside template
There is no need to create a special table for order status translations. A better choice would be to implement a generic translation mechanism, separated from your business code.

Subclassing QAbstractTableModel for a large dataset via HTML/Ajax?

I'm trying to display a large table (a playlist with title, composer, etc., so I can't use a QListWidget) via QTableView with a subclass of QAbstractTableModel. The method you have to override to retrieve the data looks like this:
QVariant data(const QModelIndex &index, int role = Qt::DisplayRole) const;
This function is called every time for every cell (specified by index.row() & index.column()). Translating that 1:1 to HTML/Ajax would kill performance even on a local network.
So what are my options here? This must be possible because QSqlQueryModel exists and they must have the same problem. Googling for a combination of Ajax/QAbstractTableModel returned nothing at all.
Any ideas?
PS: To semi-answer myself, looking at the Qt sources (src/sql/models/qsqlquerymodel.cpp) reveals the answer. It's possible, but I'm wondering if someone knows of an out-of-the-box solution.
In short, yes, you will want to utilize the concepts you have learned by looking at the source for QSqlQueryModel or some derivative. You can think of it this way: in that example, the d-pointer is really the true model storage containing the data (in their case it holds the SQL query with the data and indexes for determining whether a fetch has to occur), and QSqlQueryModel is just acting as a proxy class on top of it for integration with Qt's model/view concept.
We have a similar situation here at the office where querying for the data cell by cell would introduce large inefficiencies. Therefore we created a class that efficiently handles all the requests across the network and abstracts it away. Then we use a pretty simple QAbstractTableModel-derived class to interact with the true model class to provide the data as Qt requires it. It works very well.
Just ensure that as you receive or remove data you are properly manipulating the model data between beginInsertRows/endInsertRows and beginRemoveRows/endRemoveRows respectively.

Appropriate data structure for flat file processing?

Essentially, I have to get a flat file into a database. The flat files come in with the first two characters on each line indicating which type of record it is.
Do I create a class for each record type with properties matching the fields in the record? Should I just use arrays?
I want to load the data into some sort of data structure before saving it in the database so that I can use unit tests to verify that the data was loaded correctly.
Here's a sample of what I have to work with (BAI2 bank statements):
01,121000358,CLIENT,050312,0213,1,80,1,2/
02,CLIENT-STANDARD,BOFAGB22,1,050311,2359,,/
03,600812345678,GBP,fab1,111319005,,V,050314,0000/
88,fab2,113781251,,V,050315,0000,fab3,113781251,,V,050316,0000/
88,fab4,113781251,,V,050317,0000,fab5,113781251,,V,050318,0000/
88,010,0,,,015,0,,,045,0,,,100,302982205,,,400,302982205,,/
16,169,57626223,V,050311,0000,102 0101857345,/
88,LLOYDS TSB BANK PL 779300 99129797
88,TRF/REF 6008ABS12300015439
88,102 0101857345 K BANK GIRO CREDIT
88,/IVD-11 MAR
49,1778372829,90/
98,1778372839,1,91/
99,1778372839,1,92
I'd recommend creating classes (or structs, or whatever value type your language supports), as
record.ClientReference
is so much more descriptive than
record[0]
and, if you're using the (wonderful!) FileHelpers Library, then your terms are pretty much dictated for you.
Validation logic usually has at least two levels, the coarser level being "well-formatted" and the finer level being "correct data".
There are a few separate problems here. One issue is that of simply verifying the data, or writing tests to make sure that your parsing is accurate. A simple way to do this is to parse into a class that accepts a given range of values and throws the appropriate error if a value falls outside it,
e.g.
public void setField1(int i)
{
    if (i > 100) throw new InvalidDataException(...);
}
Creating different classes for each record type is something you might want to do if the parsing logic is significantly different for different codes, so you don't have conditional logic like
public void setField2(String s)
{
    if (field1==88 && s.equals ...
    else if (field2==22 && s
}
yechh.
When I have had to load this kind of data in the past, I have put it all into a work table with the first two characters in one field and the rest in another. Then I have parsed it out to the appropriate other work tables based on the first two characters. Then I have done any cleanup and validation before inserting the data from the second set of work tables into the database.
In SQL Server you can do this through a DTS (2000) or an SSIS package, and using SSIS you may be able to process the data on the fly without storing it in work tables first, but the process is similar: use the first two characters to determine the data flow branch to use, then parse the rest of the record into some type of holding mechanism, and then clean up and validate before inserting. I'm sure other databases also have some type of mechanism for importing data and would use a similar process.
I agree that if your data format has any sort of complexity you should create a set of custom classes to parse and hold the data, perform validation, and do any other appropriate model tasks (for instance, return a human readable description, although some would argue this would be better to put into a separate view class). This would probably be a good situation to use inheritance, where you have a parent class (possibly abstract) define the properties and methods common to all types of records, and each child class can override these methods to provide their own parsing and validation if necessary, or add their own properties and methods.
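For example, a rough sketch of that inheritance idea (record types and field positions are illustrative, loosely based on the BAI2 sample above; this is not a complete BAI2 parser):

public abstract class StatementRecord {
    protected final String[] fields;

    protected StatementRecord(String line) {
        // Naive split for illustration; real BAI2 parsing also has to handle
        // the trailing "/" and the 88 continuation records.
        this.fields = line.split(",");
    }

    public String recordCode() {
        return fields[0];
    }

    /** Each record type supplies its own "correct data" checks. */
    public abstract void validate();
}

class FileHeaderRecord extends StatementRecord {   // record code 01
    FileHeaderRecord(String line) {
        super(line);
    }

    public String senderId() {
        return fields[1];
    }

    @Override
    public void validate() {
        if (!"01".equals(recordCode())) {
            throw new IllegalArgumentException("Not a 01 file header record: " + recordCode());
        }
    }
}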
Creating a class for each type of row would be a better solution than using Arrays.
That said, however, in the past I have used ArrayLists of Hashtables to accomplish the same thing. Each item in the ArrayList is a row, and each entry in the Hashtable is a key/value pair representing column name and cell value.
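A quick sketch of that shape (the column names are made up for illustration):

import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

public class RowListDemo {
    public static void main(String[] args) {
        List<Hashtable<String, String>> rows = new ArrayList<>();

        Hashtable<String, String> row = new Hashtable<>();   // one parsed line
        row.put("RecordCode", "16");
        row.put("Amount", "57626223");
        rows.add(row);

        System.out.println(rows.get(0).get("Amount"));        // 57626223
    }
}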
Why not start by designing the database that will hold the data? Then you can use Entity Framework to generate the classes for you.
Here's a wacky idea:
If you were working in Perl, you could use DBD::CSV to read data from your flat file, provided you gave it the correct values for the separator and EOL characters. You'd then read rows from the flat file by means of SQL statements; DBI will turn them into standard Perl data structures for you, and you can run whatever validation logic you like. Once each row passes all the validation tests, you'd be able to write it into the destination database using DBD::whatever.
-steve
