What is the convention around derivative information? - protocol-buffers

I am working on a service that provides information about a few related entities, somewhat like a database. Suppose that there's calls to retrieve information about a school:
service MySchool {
rpc GetClassRoom (ClassRoomRequest) returns (ClassRoom);
rpc GetStudent (StudentRequest) returns (Student);
}
Now, suppose that I want to find out a class room's information, I'd receive a proto that looks like so:
message ClassRoom {
string id = 1;
string address = 2;
string teacher = 3;
}
Sometimes I also want to know all of the students of the classroom. I am struggling to think which is the better design pattern.
Option A) Add an extra rpc like so: rpc GetClassRoomStudents (ClassRoomRequest) returns (ClassRoomStudents), where ClassRoomStudents has a single field repeated Student students. This technique requires more than one call to get all the information that we want (and many if we wanted to know information for more than one classroom).
Option B) Add an extra repeated Student students field to the ClassRoom proto, and B') Fill it up only when necessary, or B") Fill it up whenever the server receives a GetClassRoom call. This may sometimes fetch extra information, or lead to ambiguity according to what fields are filled up.
I am not sure what's the best / most conventional way of dealing with this. How have some of you dealt with this?

There is no simple answer. It's a tradeoff between simplicity (option A) and performance (option B), and it depends on the situation which solution is best.
In general, I'd recommend to go with the simple solution first, unless your measurements demonstrate that it leads to performance issues. At that point, it's easy to add repeated Student students to ClassRoom and a field bool fetch_students [default=false] to ClassRoomRequest. Then clients are free to continue using the simple API, or choose to upgrade to the more performant API if they need to.
Note that this isn't specific to gRPC; the same issue is seen in REST APIs, and basically almost any request/response model.

Related

d3.queue is not giving an output

This is the page I am trying to make it dynamic by enabling cross-filtering.
So the thing is they are having multiple API.
For the top first two: TOTAL CASES & DAILY CASES
They are using this API and the third one in the top is based on this API.
The bottom three AGE, GENDER, and NATIONALITY are from this API.
In all the API one thing is common that is a date but there are some API in which some data are missing for few dates like there is a gap( Not available for some of the dates).
So I thought of combining all the JSON API in terms of dates and then allow cross filter because I believe I can enable cross-filtering between them. Correct me If I am wrong.
Like If I click on gender female since it gives info about total cases where the patient was female so only confirmed cases from the Total cases will change not the recovered, deaths as data is not available. SO I guess I should combine the top 3 charts together and gender, age and nationality charts, together. Then Dc js would be able to handle nicely filtering between each segments (cases related to landmark, cases related to person info).
Line 123:
var log = console.log;
var q = queue()
.defer(d3.json, "https://api.covid19india.org/data.json")
.defer(d3.json, "https://api.rootnet.in/covid19-in/unofficial/covid19india.org/statewise/history");
q.await(function(error, data1, data2) {
log("==========>");
log("data1:", error,data1);
log("data2:", data2);
});
This is not working because I can't see console.log() output.
https://blockbuilder.org/ninjakx/8c48ab6481311aa0452046d66c4d8701
So my questions are:
1) Why d3.queue is not working?
2) Suggestion whether combining all the datas together and allowing a filltering is a good idea or not as there is limited data. Should I go for cross filtering between the same api charts. So in this case I will have 2 segments (cases related to landmark, cases related to person info)..
Using DC js I want to make it more interactive and display more info.
d3.queue is obsolete
The answer to your first question is cut-and-dried: you don't need d3.queue, and it was deprecated and removed in d3#5.
As of d3#5, D3's data loading APIs use ES6 Promises instead of asynchronous callbacks, so you can use Promise.all([...]) instead of d3.queue. Apparently no way to make the new API emit errors when called in the old way, so it just fails silently. :-/
The new way to write your code is
Promise.all([
d3.json("https://api.covid19india.org/data.json"),
d3.json("https://api.rootnet.in/covid19-in/unofficial/covid19india.org/statewise/history")
]).then(([data1,data2]) => {
log("==========>");
log("data1:", data1);
log("data2:", data2);
})
.catch(error => log('error', error))
I find this much easier to read and understand. A nice side effect is that if you neglect to do error handling (like most people), you'll automatically get a clear message in the log.
Working fork of your block.
Combining multiple data sets
Your second question is pretty open-ended, maybe it would be better to bring that to the dc.js users group?
In general, it's difficult to cross-filter more than one data set. You would have more than one chart group that redraws together, and you'd have to manually add handlers on some chart to initiate, clear filters, and redraw the other chart group.
I haven't seen too many dashboards that do this. You'd have to make it clear to users what is going on.

graphql- same query with different arguments

Can the below be achieved with graph ql:
we have getusers() / getusers(id=3) / getusers(name='John). Can we use same query to accept different parameters (arguments)?
I assume you mean something like:
type Query {
getusers: [User]!
getusers(id: ID!): User
getusers(name: String!): User
}
IMHO the first thing to do is try. You should get an error saying that Query.getusers can only be defined once, which would answer your question right away.
Here's the actual spec saying that such a thing is not valid: http://facebook.github.io/graphql/June2018/#example-5e409
Quote:
Each named operation definition must be unique within a document when
referred to by its name.
Solution
From what I've seen, the most GraphQL'y way to create such an API is to define a filter input type, something like this:
input UserFilter {
ids: [ID]
names: [String]
}
and then:
type Query {
users(filter: UserFilter)
}
The resolver would check what filters were passed (if any) and query the data accordingly.
This is very simple and yet really powerful as it allows the client to query for an arbitrary number of users using an arbitrary filter. As a back-end developer you may add more options to UserFilter later on, including some pagination options and other cool things, while keeping the old API intact. And, of course, it is up to you how flexible you want this API to be.
But why is it like that?
Warning! I am assuming some things here and there, and might be wrong.
GraphQL is only a logical API layer, which is supposed to be server-agnostic. However, I believe that the original implementation was in JavaScript (citation needed). If you then consider the technical aspects of implementing a GraphQL API in JS, you might get an idea about why it is the way it is.
Each query points to a resolver function. In JS resolvers are simple functions stored inside plain objects at paths specified by the query/mutation/subscription name. As you may know, JS objects can't have more than one path with the same name. This means that you could only define a single resolver for a given query name, thus all three getusers would map to the same function Query.getusers(obj, args, ctx, info) anyway.
So even if GraphQL allowed for fields with the same name, the resolver would have to explicitly check for whatever arguments were passed, i.e. if (args.id) { ... } else if (args.name) { ... }, etc., thus partially defeating the point of having separate endpoints. On the other hand, there is an overall better (particularly from the client's perspective) way to define such an API, as demonstrated above.
Final note
GraphQL is conceptually different from REST, so it doesn't make sense to think in terms of three endpoints (/users, /users/:id and /users/:name), which is what I guess you were doing. A paradigm shift is required in order to unveil the full potential of the language.
a request of the type works:
Query {
first:getusers(),
second:getusers(id=3)
third:getusers(name='John)
}

How can the User Interface know which commands is allowed to perform against an Aggregate Root?

The UI is decoupled from the domain, but the UI should try its best to never allow the user to issue commands that are sure to fail.
Consider the following example (pseudo-code):
DiscussionController
#Security(is_logged)
#Method('POST')
#Route('addPost')
addPostToDiscussionAction(request)
discussionService.postToDiscussion(
new PostToDiscussionCommand(request.discussionId, session.myUserId, request.bodyText)
)
#Method('GET')
#Route('showDiscussion/{discussionId}')
showDiscussionAction(request)
discussionWithAllThePosts = discussionFinder.findById(request.discussionId)
canAddPostToThisDiscussion = ???
// render the discussion to the user, and use `canAddPostToThisDiscussion` to show/hide the form
// from which the user can send a request to `addPostToDiscussionAction`.
renderDiscussion(discussionWithAllThePosts, canAddPostToThisDiscussion)
PostToDiscussionCommand
constructor(discussionId, authorId, bodyText)
DiscussionApplicationService
postToDiscussion(command)
discussion = discussionRepository.get(command.discussionId)
author = collaboratorService.authorFrom(discussion.Id, command.authorId)
post = discussion.createPost(postRepository.nextIdentity(), author, command.bodyText)
postRepository.add(post)
DiscussionAggregate
// originalPoster is the Author that started the discussion
constructor(discussionId, originalPoster)
// if the discussion is closed, you can't create a post.
// *unless* if you're the author (OP) that started the discussion
createPost(postId, author, bodyText)
if (this.close && !this.originalPoster.equals(author))
throw "Discussion is closed."
return new Post(this.discussionId, postId, author, bodyText)
close()
if (this.close)
throw "Discussion already closed."
this.close = true
isClosed()
return this.close
The user goes to /showDiscussion/123 and he see the discussion with the <form> from which he can submit a new post, but only if the discussion is not closed or the current user is who started that discussion.
Or, the user goes to /showDiscussion/123 where it's presented as a REST-as-in-HATEOAS API. A hypermedia link to /addPost will be provided, only if the discussion is not closed or the authenticated user is who started that discussion.
How can I provide that knowledge into the UI?
I could code that into the read model,
canAddPostToThisDiscussion = !discussionWithAllThePosts.discussion.isClosed
&& discussionWithAllThePosts.discussion.originalPoster.id == session.currentUserId
but then I need to maintain that logic and keep it in sync with the write model. This is a fairly simple example, but as the states transitions of an aggregate become more complex, it may become really hard to do. I'd like to image my aggregates as state machines, with their workflows (like the RESTBucks example). But I don't like the idea to move that business logic outside my domain model, and put it in a service that both the read side and write side can use.
Maybe this isn't the best example, but as an aggregate root is basically a consistency boundary, we know that we need to prevent invalid state transitions in its life cycle, and in each transitions to a new state some operations may become illegal and vice versa. So, how can the user interface know what is allowed or not? What are my alternative? How should I approach to this problem? Do you have any example to provide?
How can I provide that knowledge into the UI?
The easiest way is probably to share the domain model's understanding of what is possible with the UI. Ta Da.
Here's a way to think about it -- in the abstract, all of the write model logic has a fairly simple looking shape.
{
// Notice that these statements are queries
State currentState = bookOfRecord.getState()
State nextState = model.computeNextState(currentState, command)
// This statement is a command
bookOfRecord.replace(currentState, nextState)
}
Key ideas here: the book of record is the authority of state; everybody else (including the "write model") is working with a stale copy.
What the model represents is a collection of constraints that ensure that the business invariant is satisfied. Over the lifetime of a system, there might be many different sets of constraints, as the understanding of the business changes.
The write model is the authority for which collection of constraints is currently enforced when replacing the state in the book of record. Everybody else is working with a stale copy.
The staleness is something to keep in mind; in a distributed system, any validation you perform is provisional -- unless you have a lock on the state and a lock on the model, either could be changed while your messages are in flight.
This means that your validation is approximate anyway, so you don't need to be too concerned that you have all of the fiddly details right. You assume that your stale copy of the state is approximately right, and your current understanding of the model is approximately right, and if the command is valid given those pre-conditions, then it is checked enough to send.
I don't like the idea to move that business logic outside my domain model, and put it in a service that both the read side and write side can use.
I think the best answer here is "get over it". I get it; because having the business logic inside the aggregate root is what the literature is telling us to do. But if you continue to refactor, identifying common patterns and separating concerns, you'll see that entities are really just plumbing around a reference to state and a functional core.
AggregateRoot {
final Reference<State> bookOfRecord;
final Model<State,Command> theModel;
onCommand(Command command) {
State currentState = bookOfRecord.getState()
State nextState = model.computeNextState(currentState, command)
bookOfRecord.replace(currentState, nextState)
}
}
All we've done here is taken the "construct the next state" logic, which we used to have scattered through out the AggregateRoot, and encapsulated it into a separate responsibility boundary. Here, its specific to the root itself, but an equivalent refactoring it so pass it as an argument.
AggregateRoot {
final Reference<State> bookOfRecord;
onCommand(Model<State,Command> theModel, Command command) {
State currentState = bookOfRecord.getState()
State nextState = model.computeNextState(currentState, command)
bookOfRecord.replace(currentState, nextState)
}
}
In other words, the model, teased out from the plumbing of tracking state, is a domain service. The domain logic within the domain service is just as much a part of the domain model as the domain logic within the aggregate -- the two implementations are dual to one another.
And there's no reason that a read model of your domain shouldn't have access to a domain service.
I don't like the idea of sharing domain knowlegde (code) between the write and the read models as you will have to continously keep them in sync and that'd really a chalenge even if you are the only developer in your company.
But the good knews is that you don't have to duplicate anything. If you designed your Aggregate to be pure, with no side effect as you should do (!), you can simply send it the command but without persisting the changes. If the command throws an exception then the command would not succeed, otherwise the command would succeed. In case of CQRS this is even better as you have a 3rd outcome: idempotent command detection in which case the command succeeds but it has no effect (no events are raised but no exception is thrown either) and the UI might find this interesting.
So, as an example you could have something like this:
DiscussionController
#Security(is_logged)
#Method('POST')
#Route('addPost')
addPostToDiscussionAction(request)
discussionService.postToDiscussion(
new PostToDiscussionCommand(request.discussionId, session.myUserId, request.bodyText)
)
#Method('GET')
#Route('showDiscussion/{discussionId}')
showDiscussionAction(request)
discussionWithAllThePosts = discussionFinder.findById(request.discussionId)
canAddPostToThisDiscussion = discussionService.canPostToDiscussion(request.discussionId, session.myUserId, "some sample body")
// render the discussion to the user, and use `canAddPostToThisDiscussion` to show/hide the form
// from which the user can send a request to `addPostToDiscussionAction`.
renderDiscussion(discussionWithAllThePosts, canAddPostToThisDiscussion)
DiscussionApplicationService
postToDiscussion(command)
discussion = discussionRepository.get(command.discussionId)
author = collaboratorService.authorFrom(discussion.Id, command.authorId)
post = discussion.createPost(postRepository.nextIdentity(), author, command.bodyText)
postRepository.add(post)
canPostToDiscussion(discussionId, authorId, bodyText)
discussion = discussionRepository.get(discussionId)
author = collaboratorService.authorFrom(discussion.Id, authorId)
try
{
post = discussion.createPost(postRepository.nextIdentity(), author, bodyText)
return true
}
catch (exception)
{
return false
}
You could even have a method named whyCantPostToDiscussion that would return the exception or the exception message and display it in the UI.
There is only one issue with the code: the call to postRepository.nextIdentity() because it would increase the next ID every time but you could replace it with something like postRepository.getBiggestIdentity() that should have no side effect.
I find it is rare that authorization is actually part of the domain. If it isn't, it makes sense to move that logic out into its own service which the UI and the domain can make use of.
I like to build up a set of rules using the specification pattern. I find it to be a fairly elegant way to build up the rules.
This also plays very well in a CQRS context as you can run each command through the 'rules engine' before they get issued to your AR's. If you push queries through a message routeing system you can do the same for queries. I've had a lot of success with this approach.
The response you are looking for is HATEOAS, look no further. You must implement your rest api as really restful (level 3) adhering to hypertext to model the state transitions and return links to the clients (being the UI one of those). These links represent the actions the user can execute in its context according to the model state. It´s simple. If you return a link from the server then you bind it to a button in the UI, if you don´t return the link because of business invariants then you do not show the button on the UI. There is a lot more of concepts behind it such as designing a good API supporting a well designed domain model behind but this is the general idea around it and fits exactly what you want.

'Maximum number of expressions in a list is 1000' error with Grails and Oracle

I'm using Grails with an Oracle database. Most of the data in my application is part of a hierarchy that goes something like this (each item containing the following one):
Direction
Group
Building site
Contract
Inspection
Non-conformity
Data visible to a user is filtered according to his accesses which can be at the Direction, Group or Building Site level depending on user role.
We easily accomplished this by creating a listWithSecurity method for the BuildingSite domain class which we use instead of list across most of the system. We created another listWithSecurity method for Contract. It basically does a Contract.findAllByContractIn(BuildingSite.listWithSecurity). And so on with the other classes. This has the advantage of keeping all the actual access logic in BuildingSite.listWithsecurity.
The problem came when we started getting real data in the system. We quickly hit the "ora-01795 maximum number of expressions in a list is 1000" error. Fair enough, passing a list of over 1000 literals is not the most efficient thing to do so I tried other ways even though it meant I would have to deport the security logic to each controller.
The obvious way seemed to use a criteria such as this (I only put the Direction level access here for simplicity):
def c = NonConformity.createCriteria()
def listToReturn = c.list(max:params.max, offset: params.offset?.toInteger() ?: 0)
{
inspection {
contract {
buildingSite {
group {
'in'("direction",listOfOneOrTwoDirections)
}
}
}
}
}
I was expecting Grails to generate a single query with joins that would avoid the ora-01795 error but it seems to be calling a separate query for each level and passing the result back to Oracle as literal in an 'in' to query the other level. In other words, it does exactly what I was doing so I get the same error.
Actually, it might be optimising a bit. It seems to be solving the problem but only for one level. In the previous example, I wouldn't get an error for 1001 inspections but I would get it for 1001 contracts or building sites.
I also tried to do basically the same thing with findAll and a single HQL where statement to which I passed a single direction to get the nonConformities in one query. Same thing. It solves the first levels but I get the same error for other levels.
I did manage to patch it by splitting my 'in' criteria into many 'in' inside an 'or' so no single list of literals is more than 1000 long but that's profoundly ugly code. A single findAllBy[…]In becomes over 10 lines of code. And in the long run, it will probably cause performance problems since we're stuck doing queries with a very large amount of parameters.
Has anyone encountered and solved this problem in a more elegant and efficient way?
This won't win any efficiency awards but I thought I'd post it as an option if you just plainly need to query a list of more than 1000 items none of the more efficient options are available/appropriate. (This stackoverflow question is at the top of Google search results for "grails oracle 1000")
In a grails criteria you can make use of Groovy's collate() method to break up your list...
Instead of this:
def result = MyDomain.createCriteria().list {
'in'('id', idList)
}
...which throws this exception:
could not execute query
org.hibernate.exception.SQLGrammarException: could not execute query
at grails.orm.HibernateCriteriaBuilder.invokeMethod(HibernateCriteriaBuilder.java:1616)
at TempIntegrationSpec.oracle 1000 expression max in a list(TempIntegrationSpec.groovy:21)
Caused by: java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440)
You'll end up with something like this:
def result = MyDomain.createCriteria().list {
or { idList.collate(1000).each { 'in'('id', it) } }
}
It's unfortunate that Hibernate or Grails doesn't do this for you behind the scenes when you try to do an inList of > 1000 items and you're using an Oracle dialect.
I agree with the many discussions on this topic of refactoring your design to not end up with 1000+ item lists but regardless, the above code will do the job.
Along the same lines as Juergen's comment, I've approached a similar problem by creating a DB view that flattens out user/role access rules at their most granular level (Building Site in your case?) At a minimum, this view might contain just two columns: a Building Site ID and a user/group name. So, in the case where a user has Direction-level access, he/she would have many rows in the security view - one row for each child Building Site of the Direction(s) that the user is permitted to access.
Then, it would be a matter of creating a read-only GORM class that maps to your security view, joining this to your other domain classes, and filtering using the view's user/role field. With any luck, you'll be able to do this entirely in GORM (a few tips here: http://grails.1312388.n4.nabble.com/Grails-Domain-Class-and-Database-View-td3681188.html)
You might, however, need to have some fun with Hibernate: http://grails.org/doc/latest/guide/hibernate.html

Reliable and efficient way to handle Azure Table Batch updates

I have an IEnumerable that I'd like to add to Azure Table in the most efficient way possible. Since every batch write has to be directed to the same PartitionKey, with a limit of 100 rows per write...
Does anyone want to take a crack at implementing this the "right" way as referenced in the TODO section? I'm not sure why MSFT didn't finish the task here...
Also I'm not sure if error handling will complicate this, or the correct way to implement it. Here is the code from the Microsoft Patterns and Practices team for Windows Azure "Tailspin Toys" demo
public void Add(IEnumerable<T> objs)
{
// todo: Optimize: The Add method that takes an IEnumerable parameter should check the number of items in the batch and the size of the payload before calling the SaveChanges method with the SaveChangesOptions.Batch option. For more information about batches and Windows Azure table storage, see the section, "Transactions in aExpense," in Chapter 5, "Phase 2: Automating Deployment and Using Windows Azure Storage," of the book, Windows Azure Architecture Guide, Part 1: Moving Applications to the Cloud, available at http://msdn.microsoft.com/en-us/library/ff728592.aspx.
TableServiceContext context = this.CreateContext();
foreach (var obj in objs)
{
context.AddObject(this.tableName, obj);
}
var saveChangesOptions = SaveChangesOptions.None;
if (objs.Distinct(new PartitionKeyComparer()).Count() == 1)
{
saveChangesOptions = SaveChangesOptions.Batch;
}
context.SaveChanges(saveChangesOptions);
}
private class PartitionKeyComparer : IEqualityComparer<TableServiceEntity>
{
public bool Equals(TableServiceEntity x, TableServiceEntity y)
{
return string.Compare(x.PartitionKey, y.PartitionKey, true, System.Globalization.CultureInfo.InvariantCulture) == 0;
}
public int GetHashCode(TableServiceEntity obj)
{
return obj.PartitionKey.GetHashCode();
}
}
Well, we (the patterns & practices team) just optimized for showing other things we considered useful. The code above is not really a "general purpose library", but rather a specific method for the sample that uses it.
At that moment we thought that adding that extra error handling would not add much, and we diceided to keep it simple, but....we might have been wrong.
Anyway, if you follow the link in the //TODO:, you will find another section of a previous guide we wrote that talks a little bit more on error handling in "complex" storage transactions (not in the "ACID" form though as transactions "ala DTC" are not supported in Windows Azure Storage).
Link is this: http://msdn.microsoft.com/en-us/library/ff803365.aspx
The limitations are listed in more detail there:
Only one instance of the entity should be present in the batch
Max 100 entities or 4 MB payload
Same PartitionKey (which is being handled in the code: notice that "batch" is only specified if there's a single Partition key)
etc.
Adding some extra error handling should not overcomplicate things too much, but depends on the type of app you are building on top of this and your preference to handle this higher or lower in your app stack. In our example, the app would never expect > 100 entities anyway, so it would simply bubble the exception up if that situation happens (because it should be truly exceptional). Same with the total size. The use cases implemented in the app make it impossible to have the same entity in the same collection, so again, that should never happen (and if it happens, it wouls simply throw)
All "entity group transactions" limitations are documented here: http://msdn.microsoft.com/en-us/library/dd894038.aspx
Let us know how it goes! I'm also interested to know if other pieces of the guide were useful for you.

Resources