Objectify, efficient relationships. Ref<> vs storing id and duplicating fields - performance

I'm having a hard time understanding Objectify's entity relationship concepts. Let's say that I have entities User and UsersAction.
class User{
String nick;
}
class UsersAction{
Date actionDate;
}
Now in the front-end app I want to load many UsersActions and display them, along with the corresponding user's nick. I'm familiar with two ways of dealing with this:
Use Ref<>:
I can put a @Load Ref in UsersAction, so it will create a link between these entities. Later, while loading a UsersAction, Objectify will load the proper User.
class User{
String nick;
}
class UsersAction{
@Load Ref<User> user;
Date actionDate;
}
Store Id and duplicate nick in UsersAction:
I can also store User's Id in UsersAction and duplicate User's nick while saving UsersAction.
class User{
String nick;
}
class UsersAction{
Long usersId;
String usersNick;
Date actionDate;
}
When using Ref<>, as far as I understand, Objectify will load all needed UsersActions, then all corresponding Users. When using duplication, Objectify will only need to load UsersActions and all the data will be there. Now, my question is: is there a significant difference in performance between these approaches? Efficiency is my priority, but the second solution seems ugly and dangerous to me since it causes data duplication, and when a User changes his nick, I need to update his Actions too.

You're asking whether it is better to denormalize the nickname. It's hard to say without knowing what kinds of queries you plan to run, but generally speaking the answer is probably no. It sounds like premature optimization.
One thing you might consider is making User a @Parent Ref<?> of UsersAction. That way the parent will be fetched at the same time as the action in the same bulk get. As long as it fits your required transaction throughput (no more than 1 change per second for the whole User entity group), it should be fine.
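To make the trade-off from the question concrete, here is a rough plain-Java model of the two variants. It does not use Objectify's API at all: FakeDatastore, ActionRow and NickResolver are hypothetical names, and the sketch assumes, as the question describes, that referenced users are resolved in one batch after the actions are loaded. Under that assumption the reference variant costs one extra batch fetch in total, not one fetch per action.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a datastore; it only counts batch fetches.
final class FakeDatastore {
    private final Map<Long, String> nickByUserId = new HashMap<>();
    int roundTrips = 0;

    void putUser(long id, String nick) {
        nickByUserId.put(id, nick);
    }

    // One call models one round trip, however many ids are requested.
    Map<Long, String> getNicks(Set<Long> userIds) {
        roundTrips++;
        Map<Long, String> result = new HashMap<>();
        for (Long id : userIds) {
            result.put(id, nickByUserId.get(id));
        }
        return result;
    }
}

// An already-loaded action; duplicatedNick is only used by the denormalized variant.
final class ActionRow {
    final long userId;
    final String duplicatedNick;

    ActionRow(long userId, String duplicatedNick) {
        this.userId = userId;
        this.duplicatedNick = duplicatedNick;
    }
}

final class NickResolver {
    // Reference variant: collect the distinct user ids, then fetch them in one batch.
    static List<String> viaBatchGet(List<ActionRow> actions, FakeDatastore store) {
        Set<Long> ids = new LinkedHashSet<>();
        for (ActionRow action : actions) {
            ids.add(action.userId);
        }
        Map<Long, String> nicks = store.getNicks(ids);
        List<String> out = new ArrayList<>();
        for (ActionRow action : actions) {
            out.add(nicks.get(action.userId));
        }
        return out;
    }

    // Denormalized variant: the nick travels with the action, so no extra fetch is needed.
    static List<String> viaDuplicatedField(List<ActionRow> actions) {
        List<String> out = new ArrayList<>();
        for (ActionRow action : actions) {
            out.add(action.duplicatedNick);
        }
        return out;
    }
}
```

Whether that one extra batch matters depends on the queries you run, which is why measuring first is usually the safer route.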


Entity/Domain purity dilemma in the clean architecture/Domain-driven design

I'm working on an e-commerce system in which I try to implement the clean architecture.
But currently I'm a little bit stuck.
So I have a use case called CreateItemUseCase in which I create an Item (alias product) for the shop.
In this use case I call a method (createItemEntity()) of an entity called ItemEntity.
This method creates just a data object with data like:
userId
itemTitle
itemDescription
...
So now I need another method in the ItemEntity which validates the userId.
To create an Item the user needs to have a userId, so the method in the ItemEntity would be called:
validateUserId()
This method should check if the user has a userId in the database, and if not, the Item creation would be impossible.
Now my question:
How do I validate the userId?
Should I have the validateUserId() method take an array as a parameter, in which all the user IDs are stored? Something like this:
validateUserId(toBeValidated: Int, allUserIds: Array[Int])
{
// loop through the allUserIds to see if toBeValidated is in there ...
}
Or should I query the data in the method (which I'm pretty sure would violate the dependency rule), like this:
validateUserId(toBeValidated: Int)
{
// get all user IDs through a query, and check if toBeValidated is in there ...
}
Or should I do it completely differently?
In general, entities should only contain logic that is operating on information (data) that is within the entity's scope. Knowing how to query if a user with a certain user id exists or not is not in the scope of the item entity.
I think your motivation to keep all the validation logic together is reasonable, but on the other hand you should not introduce infrastructure dependencies (like talking to the database or a user repository) into the entity.
"Or should I query the data in the method (which I'm pretty sure would violate the dependency rule), like this"
Exactly, that's why it's usually best to avoid that and keep entities free from such dependencies. Introducing such dependencies can easily get out of hand and also makes testing such entities more complex. If you need to do it, it should be a carefully considered decision with a clear justification.
"Should I have the validateUserId() method take an array as a parameter, in which all the user IDs are stored? Something like this"
This is not such a bad idea in general, because you would not make the entity dependent on infrastructure and you would provide the entity with all the data it needs for decision making. But on the other hand you can now run into another problem: bad performance.
Now you would retrieve all user IDs every time you create an item. If you do the check for the user's existence somewhere else, this can be optimized much better.
I suggest asking the user repository whether the user exists before performing the entity creation, including all the other potentially required validations inside the item entity that make sense there. The user repository could have a query optimized for just checking the existence of this user by id.
In case these two operations (asking for the user's existence and creating the new item) only happen in one place in the application, I'd be pragmatic and perform the user existence check directly in the use case. If this occurs in several places in your application, you can extract that logic into a separate (domain) service (e.g. an item service) which deals with the repetitive flow of operations working with the user repository and the item entity.
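The flow suggested above could be sketched roughly as follows. This is only an illustration under assumptions: UserRepository, existsById, InMemoryUserRepository and the title check inside ItemEntity are hypothetical names and rules, not something taken from the question's code base. The point is that the entity validates only its own data, while the use case asks the repository a cheap existence question first.

```java
import java.util.Set;

// Hypothetical port owned by the application layer; name and method are assumptions.
interface UserRepository {
    boolean existsById(int userId);
}

// The entity only validates data that lies within its own scope.
final class ItemEntity {
    final int userId;
    final String title;
    final String description;

    private ItemEntity(int userId, String title, String description) {
        this.userId = userId;
        this.title = title;
        this.description = description;
    }

    static ItemEntity create(int userId, String title, String description) {
        if (title == null || title.isBlank()) {
            throw new IllegalArgumentException("itemTitle must not be empty");
        }
        return new ItemEntity(userId, title, description);
    }
}

// The use case orchestrates: existence check first, entity creation second.
final class CreateItemUseCase {
    private final UserRepository users;

    CreateItemUseCase(UserRepository users) {
        this.users = users;
    }

    ItemEntity execute(int userId, String title, String description) {
        if (!users.existsById(userId)) {
            throw new IllegalStateException("Unknown user id: " + userId);
        }
        return ItemEntity.create(userId, title, description);
    }
}

// In-memory stand-in for a real repository, for demonstration and tests only.
final class InMemoryUserRepository implements UserRepository {
    private final Set<Integer> ids;

    InMemoryUserRepository(Set<Integer> ids) {
        this.ids = ids;
    }

    @Override
    public boolean existsById(int userId) {
        return ids.contains(userId);
    }
}
```

Because the use case depends only on the interface, a real implementation can answer existsById with a single optimized query, and the entity stays free of infrastructure.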
What you are dealing with here is a trade-off between domain model purity, domain model completeness and performance. In this great blog post this is called the Domain-Driven Design Trilemma. I suggest going through the reasoning in the article; I'm pretty sure it will help you come to a final decision.
I think this is a side case of what we call Business Gerunds.
Details: https://www.forbes.com/sites/forbestechcouncil/2022/05/19/10-best-practices-for-event-streaming-success/
If Item has to validate the user, look at which attributes the entities have in common and who is responsible for changing them. The DDD representation can then be segregated along those lines, and by using a composite via translation, entities from the outside world can continue to exist unchanged.

Dynamically constructing composition object

I have a review table for courses which is made up of multiple objects for different courses. A student should review the courses he is enrolled in every month. Math, Science and History are tables by themselves, but I store foreign keys in the Review table so that each review for the courses is associated with the respective table.
NOTE: a student can only be enrolled in two courses.
@Entity
class Review{
//multiple time fields here
@OneToOne(cascade=CascadeType.ALL,optional=true)
@JoinColumn(name="math_review_id")
Math m;
@OneToOne(cascade=CascadeType.ALL,optional=true)
@JoinColumn(name="science_review_id")
Science s;
@OneToOne(cascade=CascadeType.ALL,optional=true)
@JoinColumn(name="history_review_id")
History h;
}
Super Class
@MappedSuperclass
class Course {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
@Column(name="id")
int id;
@ManyToOne(fetch = FetchType.LAZY,
cascade = { CascadeType.DETACH,
CascadeType.MERGE,
CascadeType.PERSIST,
CascadeType.REFRESH })
@JoinColumn(name = "student_id")
private Student student;
}
Subclass History
@Entity
class History extends Course{
//fields specific to history course
}
Subclass Math
@Entity
class Math extends Course{
//fields specific to math course
}
Student class
@Entity
class Student{
//fields name,id,...
@OneToMany(mappedBy = "student",
cascade = CascadeType.ALL,
fetch = FetchType.LAZY)
private List<Review> reviewsList;
}
I check what courses the student is enrolled in and initialize Math, Science and History accordingly. I pass a Review object to my reviews.jsp and save the returned @ModelAttribute using Hibernate. I don't initialize the courses the student is not enrolled in. I thought uninitialized objects wouldn't be saved, but Hibernate makes null entries even if they are not initialized (I think because they are mapped to a table and are inside a persistent class). I need help with how to dynamically construct the Review object with just the courses the student is enrolled in. My current design might have flaws, so any better design suggestions are much appreciated (I have minimal experience in Java and Hibernate).
As a suggestion, I think you should be wary of creating a class per course. Would it not be sufficient to have a Course class which has a type member, which could be Math, Science or History? Even that type could itself be an entity, CourseType, which you could have entries for, so in your code there would be no Math, Science or History. Those would live in the database instead of in code.
The Review object would then only interact with a Course. Just think of all the work you will need to do when you add another course. You will have to update many different files and even add a table in your database; I don't believe you should need to do that.
I imagine you may have some differences between Course classes, and it may be a bit awkward having all these in one class. But in my experience this is typically worth doing, as it drastically reduces the amount of code and allows more courses to be added without code changes.
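A minimal sketch of that suggestion, with persistence annotations left out to keep it short. The field names (name, studentName) and the idea of Review holding a list of courses are assumptions made for illustration, not the asker's schema. Each course type is a piece of data, and a review only contains the courses that actually apply, so there are no unused slots to end up as null entries.

```java
import java.util.ArrayList;
import java.util.List;

// A course type is a row of data (e.g. "Math"), not a Java class of its own.
final class CourseType {
    final String name;

    CourseType(String name) {
        this.name = name;
    }
}

// One class covers every course; the type field says which kind it is.
final class Course {
    final CourseType type;
    final String studentName;

    Course(CourseType type, String studentName) {
        this.type = type;
        this.studentName = studentName;
    }
}

// A review holds only the courses the student is enrolled in.
final class Review {
    private final List<Course> courses = new ArrayList<>();

    void addCourse(Course course) {
        courses.add(course);
    }

    // Returns an unmodifiable snapshot so callers cannot change the review by accident.
    List<Course> courses() {
        return List.copyOf(courses);
    }
}
```

Adding a new subject then means creating another CourseType value (for example new CourseType("Geography")), with no new class and no new table.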
Edit: I still strongly recommend you reconsider your decision of one class per course, but anyway, it's your decision. It's really unclear what this Review object is. You say that there are only 2 courses a student will be enrolled in, so I imagine that 2 of these fields are null, then. But then it confuses me, because you have one class per course but an overarching review object across all subjects. I would have expected to see:
class EnrolementReview{
Course courseA;
Course courseB;
}
Otherwise if your review depends on fields in your Math, or Science courses, I would expect to have a review class for each course:
class MathReview {
MathCourse course;
}
Or you might have a generic base class for review
abstract class CourseReview<C extends Course> {
C course;
}
if you had common functionality between them. And then a SemesterReview class for reviewing 2 classes in a semester:
class SemesterReview{
CourseReview review1;
CourseReview review2;
}
As far as dynamic composition goes, I don't think the notion makes much sense in a statically typed language. You have builder patterns and cake patterns and the like. Some programming languages have some nice features in this area, like Scala traits, but the benefits are very limited: nothing you couldn't do in Java with a couple of class casts, which are a bit evil but get the job done.
For all the many permutations of design patterns and methods available to you as a developer, I think it's a bit easier to look at some of the outputs of your design decision, such as:
How many classes am I going to write?
Will I be able to add more courses without writing code and changing the database tables?
Can somebody else understand my code?
Last Edit
With regard to many null fields being a concern, you have some options. Either you have null fields (which you seem not to like), or you encapsulate variable types as an entity (for example, each course has a list of Strings, Integers, Doubles etc.), which I've seen used quite a bit in many different situations. That works out OK, but you move some checks that could have happened at compile time to run time, as you may need an integer which has a variable name of scienceCategory etc. It can also be awkward if you have some structured data. In general that approach is only good if you really don't know how a client is going to use your system and so you expose more of it to them to use.
However, my personal favourite is to follow the natural composition of your class and encapsulate families of variables into their own class when you expect they won't always be applicable, as in: if one isn't applicable, none of the others will be either. The logic for whether these classes are present should be very explicit, however: either they should be Optional<ScienceInformation>, or you should have a method somewhere which returns a boolean saying whether this option should exist. Then operations can be performed on those options in the system you create. Just be careful not to create objects which are too deeply nested, as in objects which are made up of objects, which are made up of objects (not always a problem, but usually it is).
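As a rough sketch of the Optional<ScienceInformation> idea: the fields labGroup and experimentsCompleted, the CourseReview shape and the summary() method are made up for illustration. The family of science-only fields lives in its own class, and the review exposes it as an Optional so its absence is explicit rather than a scattering of null fields.

```java
import java.util.Optional;

// Hypothetical family of fields that only applies to science courses.
final class ScienceInformation {
    final String labGroup;
    final int experimentsCompleted;

    ScienceInformation(String labGroup, int experimentsCompleted) {
        this.labGroup = labGroup;
        this.experimentsCompleted = experimentsCompleted;
    }
}

final class CourseReview {
    final String courseName;
    private final ScienceInformation science; // null when not applicable

    CourseReview(String courseName, ScienceInformation science) {
        this.courseName = courseName;
        this.science = science;
    }

    // Presence is explicit in the return type, so callers must handle the empty case.
    Optional<ScienceInformation> scienceInformation() {
        return Optional.ofNullable(science);
    }

    // Example operation that works for both kinds of review.
    String summary() {
        return scienceInformation()
                .map(info -> courseName + ": " + info.experimentsCompleted + " experiments")
                .orElse(courseName + ": no lab work");
    }
}
```

Either all science fields are present or none are, which matches the "if one isn't applicable, none of the others are" grouping described above.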
But really I don't think it's super important which way you choose; none of them will give you the comfy feeling that writing a class gives you. You just need to think about how you are going to abstract over these entities (e.g. Course) in a way which will not lead you to an unmaintainable mess. You clearly have a very complete knowledge of your domain, but you should write your code in such a way that I (somebody who doesn't know how many courses are in a semester) can read the code and find out the answer to that question without reading comments (that would be cheating).

'Existing Entity' constraint

I'm reading some data from an Excel file and hydrating it into an object of class A. Now I have to make sure that one of the fields of the data corresponds to the id of a specific entity, i.e.:
class A{
protected $entityId;
}
I have to make sure that $entityId is an existing id of a specific entity (let's call it Foo). Now this can be achieved using the Choice constraint, by supplying all of the existing ids of Foo as the choices option. However, this will obviously cause a performance overhead. Is there a standard/better way to do this?
I'm a bit confused about what you are doing, since you seem to talk about Excel parsing, but at the same time you mention choices, which in my opinion relate to Forms.
IMO you should handle the relationship to your entity directly, instead of only its id. Most of the time it is better to have the related entity itself as an attribute of your class A rather than only the id, and Symfony handles such cases pretty well.
Then just have your Excel parser do something like this:
$relatedEntity = $this->relatedEntityRepository->find($entityId);
if (!$relatedEntity) {
throw new \Exception();
}
$entity->setRelatedEntity($relatedEntity);
After doing this, since you were talking about Forms, you can then use an EntityType field, which will automatically perform the query against the database. Use query_builder if you need to filter the results.

Validation on domain entities along with MVP

How do you apply validation in an MVP/domain environment ?
Let me clarify with an example:
Domain entity:
class Customer
{
string Name;
etc.
}
MVP-model
class CustomerModel
{
string Name;
etc.
}
I want to apply validation on my domain entities, but the MVP model has its own model/class apart from the domain entity. Does that mean I have to copy the validation code to also work on the MVP model?
One solution I came up with is to drop the MVP model and use the domain entity as the MVP model, but I don't want to set data on the entities that isn't validated yet.
A second problem that arises is that if the entity has notify events, other parts of the application will be affected by faulty data.
A third thing with that approach is: if the user edits some data and then cancels the edit, how do I revert to the old values? (The entity might not come from a DB, so reloading the entity isn't possible in all cases.)
Another solution is to make some sort of copy/clone of the entity in question and use the copy as MVP-model, but then it might get troublesome if the entity has a large object graph.
Anyone has some tips about these problems?
Constraining something like the name of a person probably does not rightfully belong in the domain model, unless in the client's company there is actually a rule that they don't do business with customers whose names exceed 96 characters.
String length and the like are not concerns of the domain -- two different applications employing the same model could have different requirements, depending on the UI, persistence constraints, and use cases.
On the one hand, you want to be sure that your model of a person is complete and accurate, but consider the "real world" person you are modeling. There are no rules about length and no logical corollary to "oops, there was a problem trying to give this person a name." A person just has a name, so I'd argue that it is the responsibility of the presenter to validate what the user enters before populating the domain model, because the format of the data is a concern of the application more so than the domain.
Furthermore, as Udi Dahan explains in his article, Employing the Domain Model Pattern, we use the domain model pattern to encapsulate rules that are subject to change. That a person should not have a null name is not a requirement that is likely ever to change.
I might consider using Debug.Assert() in the domain entity just for an added layer of protection through integration and/or manual testing, if I was really concerned about a null name sneaking in, but something like length, again, doesn't belong there.
Don't use your domain entities directly -- keep that presentation layer; you're going to need it. You laid out three very real problems with using entities directly (I think Udi Dahan's article touches on this as well).
Your domain model should not acquiesce to the needs of the application, and soon enough your UI is going to need an event or collection filter that you're just going to have to stick into that entity. Let the presentation layer serve as the adapter instead and each layer will be able to maintain its integrity.
Let me be clear that the domain model does not have to be devoid of validation, but the validation that it contains should be domain-specific. For example, when attempting to give someone a pay raise, there may be a requirement that no raise can be awarded within 6 months of the last one so you'd need to validate the effective date of the raise. This is a business rule, is subject to change, and absolutely belongs in the domain model.
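The pay-raise rule could look something like this inside the domain entity. It is only a sketch: the Employee class, its fields, and the reading that a raise exactly six months after the previous one is allowed are assumptions for illustration.

```java
import java.time.LocalDate;

final class Employee {
    // Business rule from the example above; kept in one place because it may change.
    private static final int MIN_MONTHS_BETWEEN_RAISES = 6;

    private long salaryCents;
    private LocalDate lastRaiseDate; // null if the employee has never had a raise

    Employee(long salaryCents, LocalDate lastRaiseDate) {
        this.salaryCents = salaryCents;
        this.lastRaiseDate = lastRaiseDate;
    }

    // Domain-specific validation: reject a raise that comes too soon after the last one.
    void giveRaise(long amountCents, LocalDate effectiveDate) {
        if (lastRaiseDate != null) {
            LocalDate earliestAllowed = lastRaiseDate.plusMonths(MIN_MONTHS_BETWEEN_RAISES);
            if (effectiveDate.isBefore(earliestAllowed)) {
                throw new IllegalArgumentException(
                        "Next raise is not allowed before " + earliestAllowed);
            }
        }
        salaryCents += amountCents;
        lastRaiseDate = effectiveDate;
    }

    long salaryCents() {
        return salaryCents;
    }
}
```

Format checks such as name length would stay in the presenter; only rules like this one, which the business itself owns, end up in the entity.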

Fat Domain Models => Inefficient?

Looking at DDD, we abstract the database into the various models which we operate on and look at it as a repository where our models live. Then we add the Data Layers and the Service/Business layers on top of it. My question is, in doing so, are we creating inefficiencies in data transfer by building fat models?
For example, say we have system that displays an invoice for a customer on the screen.
Thinking of it in terms of OOP, we'd probably end up with an object that looks somewhat like this:
class Invoice {
Customer _customer;
OrderItems _orderitems;
ShippingInfo _shippingInfo;
}
class Customer {
string name;
int customerID;
Address customerAddress;
AccountingInfo accountingInfo;
ShoppingHistory customerHistory;
}
(for the sake of the question/argument,
let's say it was determined that the customer class had to
implement AccountingInfo and ShoppingHistory)
If the invoice solely needs to print the customer name, why would we carry all the other baggage with it? Using the repository type of approach seems like we would be building these complex domain objects which require all these resources (CPU, memory, complex query joins, etc) AND then transferring it over the tubes to the client.
Simply adding a customerName property to the invoice class would be breaking away from abstractions and seems like a horrible practice. On the third hand, half filling an object like the Customer seems like a very bad idea, as you could end up creating multiple versions of the same object (e.g. one that has an Address but no ShoppingHistory, and one that has AccountingInfo but no Address, etc.). What am I missing, or not understanding?
Good object-relational mappers can lazy load relationships, so you would pull back the customer for your invoice but ignore their accounting and shopping history. You could roll your own if you're not using an object-relational mapper.
Often you can't do this within your client, as you'll have crossed your transaction boundary (ended your database transaction), so it is up to your service layer to ensure the right data has been loaded.
Testing that the right data is available (and not too much of it) is often good to do in unit tests on the service layer.
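If you do roll your own, lazy loading can be as small as the following sketch. It is a hand-written illustration, not how any particular ORM implements it, and the Lazy helper, the loader parameter and the orderCount field are hypothetical. It is also not thread-safe, which a real implementation would need to consider.

```java
import java.util.function.Supplier;

// Caches the result of a loader so the expensive fetch runs at most once.
final class Lazy<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    boolean isLoaded() {
        return loaded;
    }
}

final class ShoppingHistory {
    final int orderCount;

    ShoppingHistory(int orderCount) {
        this.orderCount = orderCount;
    }
}

final class Customer {
    final String name;
    private final Lazy<ShoppingHistory> history;

    // The loader stands in for whatever query would fetch the history.
    Customer(String name, Supplier<ShoppingHistory> historyLoader) {
        this.name = name;
        this.history = new Lazy<>(historyLoader);
    }

    // The history is fetched on first access only.
    ShoppingHistory history() {
        return history.get();
    }

    boolean historyLoaded() {
        return history.isLoaded();
    }
}
```

Printing an invoice would only read customer.name, so the history loader never runs; a service-layer unit test can assert historyLoaded() to check that no more data than needed was fetched.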
You say "it was determined that the customer class had to implement AccountingInfo and ShoppingHistory", so clearly displaying an invoice is NOT the only task that the system performs (how else was it "determined" that customers need those other functionalities otherwise?-).
So you need a table of customers anyway (for those other functionalities) -- and of course your invoice printer needs to get customer data (even just the name) from that one table, the same one that's used by other functionality in the system.
So the "overhead" is purely illusory -- it appears to exist when you look at one functionality in isolation, but doesn't exist at all when you look at the whole system as an integrated whole.
