DDD and Entity Framework, Filters - performance

So I am struggling with the approach DDD has to follow when we talk about filtering and queries. From this SO question Is it okay to bypass the repository pattern for complex queries? I can see the filtering by User should be done after getting all the Products. The piece of code from the accepted answer is:
Products products = /* get Products repository implementation */;
IList<Product> res = products.BoughtByUser(User user);
But wait, and if the database has 1 million Products? Isn't the best approach to do this filter directly in the database like so:
productsRepository.Find(p => p.User.Id == userId);
But from my actual knowledge of DDD this would be wrong, because this logic should be inside the Product itself.
Therefore, how to handle this scenario?

I agree with Yorro's answer. According to the comment, products is indeed a repository.
The question around performance of the underlying datastructure vs keeping the domain knowledge in the application could be explored further though.
Databases are great at filtering and querying data, they are optimized to do so, and for us to ignore that simply to "keep our knowledge in the domain" is naive.
Your example shows Repository Specialization, which is fine albeit verbose.
The logic of that search is encapsulated by that call, and as long as the interface for calling that method is in the domain, and the implementation in the data-layer, everything is fine.
Indeed the call could be to a stored-procedure that performs a very complex operation. (In this case, yes some of your logic has escaped the domain, but you make it as a conscious decision, and should you introduce another data technology, you would have to implement that functionality again.)
There is another option...
We can encapsulate the logic of the search in a Specification (http://en.wikipedia.org/wiki/Specification_pattern) and pass the specification from our domain logic code to our Repository who would interpret the specification and do the query.
That makes our domain oblivious of how the underlying data structure works, but it puts it in control of what the search criteria is.
I usually find myself implementing a blend of Repository Specialization, and having a base repository that accepts an ISpecification for more lightweight queries.

Based on your link, the Products class is the repository, just named without the "repository" suffix.
You are correct that the filtering should be in the database, you just don't see it because you are in the domain.
The first and second approach are the same. The difference is that the first is more align with DDD because of the proper usage of the ubiquitous language
// First example
// Take note, the products IS the repository
IList<Product> productsByUser = products.BoughtByUser(User user);
// Second example
IList<Product> productsByUser = productsRepository.Find(p => p.User.Id == userId);
If you dive in the data access layer, you can see the filtering that you are talking about.
public IList<Product> BoughByUser(User user)
{
IList<Product> products = this.dbContext.Products.Find(p => p.User.Id == user.ID);
return products;
}

This is not the direct answer of your question (Yorro's answer is right) but maybe it helps you to better understand DDD. This is a "wrong way, turn back" answer.
Your views doesn't need domain rules; doesn't need aggregates with 1 million of childs or 1 million of entities. So, you don't need to "bypass" the product repository because you should have "View Services" with "View Repositories" wich allows you to query (and paging, etc) denormalize data from persistence for your views.
You should apply domain rules using aggregates/entities when update/insert/delete is needed.
Once the user select one or several products from the 1 million list and push, for example, delete button you should use product repository to retrieve the aggregate/entity of selected products, apply delete rules and invariants and save in persistence.

Related

Entity/Domain purety dilemma in the clean architecutre/Domain driven design

Im working on a eCommerce system in which I try to implement the clean architecture.
But currently Im stuck a little bit.
So I have a use case called: CreateItemUseCase in which I create a Item (alias product) for the shop.
In this use case I call a method (createItemEntity()) of a Entity called ItemEntity.
This method creates just a data object with data like:
userId
itemTitle
itemDescription
...
So now I need another method in the ItemEntity which validates the userId.
To create a Item the user needs to have a userId so the method in the ItemEntity would be called:
validateUserId()
This method should check if the user has a userId in the database and if not the Item creation would be imposible.
Now my question:
How do I validate the userId?
Should I have the validateUserId() method take a array as a parameter, In which all the User Id´s are saved... something like this:
validateUserId(toBeValidated: Int, allUserIds: Array[Int])
{
// loop through the allUserIds to see if toBeValidated is in there ...
}
Or should I query the data in the method (which Im pretty sure, would violate the dependencie rule) like this:
validateUserId(toBeValidated: Int)
{
// get all user id´s through a query, and check if toBeValidated is in there ...
}
Or should I do it completly different?
In general, entities should only contain logic that is operating on information (data) that is within the entity's scope. Knowing how to query if a user with a certain user id exists or not is not in the scope of the item entity.
I think your motivation to keep all the logic for validation together is reasonable but on the other hand you should not introduce infrastructure dependencies (like talking to the database or user repository) to the entity. Knowing how to query if a user with a certain user id exists or not is not in the scope of the item entity.
Or should I query the data in the method (which Im pretty sure, would violate the dependencie rule) like this
Exactly, that's why it's usually best trying to avoid that to keep entities free from such dependencies. Introducing such dependencies can easily get out of hand and also increase complexity for testing such entities. If you need to do that it should be a very thought decision that justifies that.
Should I have the validateUserId() method take a array as a parameter, In which all the User Id´s are saved... something like this
This is not such a bad idea in general, because you would not make the entity dependent on infrastructure and provide the entity with all the data it needs for decision making. But on the other hand now you can run into another problem: bad performance.
Now you would retrieve all user ids everytime you create an item. If you would do the check for the user's existence somewhere else this can be optimized much better.
I suggest to ask the user repository beforehand if the user exists prior to performance the entity creation including all the other potentially required validations inside item entity that make sense there. The user repository could have a query that optimizes for just checking for the existence of this user by id.
In case these two operations (asking for the user's existence and creating the new item) only happen at one place of the application I'd be pragmatic and perform the user existence check directly in the use case. If this would occur from different places in your application you can extract that logic into a separate (domain) service (e.g. item service) which deals with the repetitive flow operations working with the user repository and item entity.
What you are dealing here with is a trade-off decision between domain model purity, domain model completeness and performance considerations. In this great blog this is named the Domain-Driven Design Trilemma. I suggest going through the reasoning in the article, I'm pretty sure it will help you coming to a final decision.
I think this is one of side case of what we call Business Gerunds
Details: https://www.forbes.com/sites/forbestechcouncil/2022/05/19/10-best-practices-for-event-streaming-success/
If Item has to validate the user, just see what common attributes are there between entities and who is responsible for change of those, and then a segregation can be done in DDD representation, and using a composite via transaltion, outside world entities can exist as the same

Should I save related models in repository?

I am working with Laravel for almost two years and trying to understand all the benefits of using Repositories and DDD. I still struggle with how to use best practices for working with data and models for better code reusability and nicer Architecture.
I have seen other developers suggesting to generate models in factories and then use Repositories for saving these models like :
public function add(User $user)
{
return $user->save();
}
but what should I do, in case my user model has models related with it, like images, description and settings.
Should I create repository for each model and call ->add() function 4 times in the controller or should I place the saving logic inside the UserRepository ->add() function passing all models as well as user? Also, how about update function, that logic might also be quite complicated.
Update - what I need is a practical example with realization.
It's always difficult to deal with "right way" questions. But here is one way.
From a DDD perspective, in this specific context, treat the User object as an aggregate root entity and the other objects as child value objects.
$description = new UserDescripton('Some description');
$image1 = new UserImage('head_shot','headshot.jpg');
$image2 = new UserImage('full_body','fullbody.jpg');
$user = new User('The Name',$description,[$image1,$image2]);
$userRepository->persist($user);
First thing to note is that you if really want to try and apply some of the ddd concepts then it is important to think in terms of domain models without worrying about how to persist them. If you find that you are basically writing a CRUD app with a bunch of getters and setters and almost no business logic then pretty much forget about it. All you will end up doing is to add complexity without much value.
The persist line is where the user will get stored. And you certainly don't want to have to write a bunch of code to store and update the children. Likewise, it would normally be waste of effort to make repositories for value objects. If you are going this route then you really need some sort of database layer that understands individual objects as well as their relations. It is the relations that are the key.
I assume you are using Laravel's Eloquent active record persistence layer. I'm not familiar enough with it to know how easy it is to persist and update an aggregate root.
The code I showed is actually based more on Doctrine 2 Object Relation Mapper and pretty much works out of the box. http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/ It is easy enough to integrate it with Laravel.
But even Doctrine 2 is largely CRUD oriented. In different domain contexts, the user object will be treated differently. It can start to get a bit involved to basically have different user implementations for different contexts. So make sure that the payoff in the domain layer is worth the effort.
I am not a PHP guy but from what I can find, Laravel an MVC framework, which has nothing to do with DDD.
Check this presentation, it does not to go to domain modelling, more concentrating on tactics but at least it has some goodness like command handling and domain events, briefly explains repositories with active record.
It also has references to two iconic DDD books at the last slide, I suggest you have a look at those too.

Does business logic live in the Model or in the Controller?

After reading https://softwareengineering.stackexchange.com/questions/165444/where-to-put-business-logic-in-mvc-design/165446#165446, I am still confused as to where I want to put my code for computing discounts. Computing a discounted price for a product or service I think is definitely part of business logic and is not part of application routing or database interaction. With that, I am struggling where to place my business logic.
Example
class Model
{
public function saveService($options)
{
$serviceId = $options['service_id'];
//Model reads "line Item" from database
$service = $this->entityManager->find('Entity\ServiceLineItem', $serviceId);
//uses the data to compute discount
//but wait.. shouldn't model do CRUD only?
//shouldn't this computation be in Controller?
$discount = ($service->getUnitPrice() * 0.25);
// Add the line item
$item = new SalesItem();
$item->setDiscount($discount);
}
}
class Controller
{
function save()
{
$this->model->saveService($options);
}
}
Question:
Above $discount computation, should it stay in Model, or does it go into Controller? If it goes into Controller, Controller has to call $service (via Model) first, then compute $discount inside Controller then send it the value back to the Model to be saved. Is that the way to do it?
Note
I may be confusing Model with "Storage". I probably need to have a Model where I do business logic and Database/Persistent Storage should be a separate layer.
Business logic belongs in a Service, so you would need to add a service layer.
Business logic tends to span multiple models, which infers that it does not belong in any single model. Therefore, it is unclear in which model you should put the logic.
Enter the service class.
I tend to make a service for each use-case the software is designed for.
In your case there could be a CheckOutService which would calculate your total sum, including discounts.
That way each service has a specific and intuitive purpose. Then, when a controller requires some business logic, it calls a service tailored for that very purpose.
A good description of what a service does in the MVC pattern can be found here:
MVCS - Model View Controller Service
I'll quote the most essential part:
The service layer would then be responsible for:
Retreiving and creating your 'Model' from various data sources (or data access objects).
Updating values across various repositories/resources.
Performing application specific logic and manipulations, etc.
edit: I just saw kayess answer. I'd say his answer is more precise than mine, but I don't know how relatable it is, given the degree of uncertainty in your question. So I hope mine can help some people in the earlier stages of their programming career.
Answering a question like this is usually opinionated or one could say it really depends on your business use case.
First off, i sense some mixup in your Model and service wording here.
A model should be your domain model, and service should either be a domain or an application service, in a different class or a different layer.
Architecturally thinking you could follow a rather simplified DDD-ish implementation.
Namely:
You implement all behavioural concepts in your domain model, which are related to the models current state
Use a factory that will query the line item from the repository
Use a domain service for mostly everything that's not related to your model, in your case calculating discount (if you need that logic reusable, for example in more models or services). You don't want to pollute your domain model with non related mechanisms and dependencies
For persistence you better use a different layer and by this you better separate most concerns you can, a reason for this is testability and another reason -amongst many others- for less code changes later
To achieve a cleaner architecture and less pain by maintaing your code in the longer run, don't forget to think about how design patterns and SOLID principles could help you implement your solution to the given business use case.
The question about how to separate business logic from data is not easily answered. However, Daniel Rocco has constructed a good discussion of the subject that you may find helpful, if not for this particular problem, then for structuring business applications in general.
I used a framework called "Yii Framework" using MVC, what it had was a function called beforeSave() in the controllers that was used to change the model values just before saving them.
Following this logic maybe the best practice would be to apply the discount to your price just before saving the model (in your Controller)
The Below Image shows how I think where your "Business" logic should go :)
The above image has the "Service" layer as the Server Side "Controller".
It is the middle man between the "Client" side Controller and the "Logic" layer.
All of your tasks that requires some type of "Logic" like
"Computing a discounted price for a product or service"
Would go into a ProductLogic Class where it takes inputs From the Service if needed and uses that information to help calculate the discounted price.
The ProductLogic Class will also query a "Data" source if needed to get the current price of an item.
The ProductLogic Class will piece together the information it collected from the Service and Repository to make the calculation and if it needs to be return to the user then the ProductLogic class will send it to the Service layer.
If it just needs to be saved in the Repository than the Logic will pass off the information to the Repository to handle.
Hope this helps :)
Have a great day!

Validation on domain entities along with MVP

How do you apply validation in an MVP/domain environment ?
Let me clearify with an example:
Domain entity:
class Customer
{
string Name;
etc.
}
MVP-model
class CustomerModel
{
string Name;
etc.
}
I want to apply validation on my domain entities but the MVP model has it's own model/class
apart from the domain entity, does that mean I have to copy the validation code
to also work on the MVP-model?
One solution I came up with is to drop the MVP-model and use the domain entity as MVP-Model,
but I don't want to set data to the entities that isn't validated yet.
And second problem that rises is that if the entity has notify-events,
other parts of the application will be affected with faulty data.
A third thing with that approach is if the user edits some data and then cancels the edit, how do I revert to the old values ? (The entity might not come from a DB so reloading the entity is't possible in all cases).
Another solution is to make some sort of copy/clone of the entity in question and use the copy as MVP-model, but then it might get troublesome if the entity has a large object graph.
Anyone has some tips about these problems?
Constraining something like the name of a person probably does not rightfully belong in the domain model, unless in the client's company there is actually a rule that they don't do business with customers whose names exceed 96 characters.
String length and the like are not concerns of the domain -- two different applications employing the same model could have different requirements, depending on the UI, persistence constraints, and use cases.
On the one hand, you want to be sure that your model of a person is complete and accurate, but consider the "real world" person you are modeling. There are no rules about length and no logical corollary to "oops, there was a problem trying to give this person a name." A person just has a name, so I'd argue that it is the responsibility of the presenter to validate what the user enters before populating the domain model, because the format of the data is a concern of the application moreso than the domain.
Furthermore, as Udi Dahan explains in his article, Employing the Domain Model Pattern, we use the domain model pattern to encapsulate rules that are subject to change. That a person should not a have a null name is not a requirement that is likely ever to change.
I might consider using Debug.Assert() in the domain entity just for an added layer of protection through integration and/or manual testing, if I was really concerned about a null name sneaking in, but something like length, again, doesn't belong there.
Don't use your domain entities directly -- keep that presentation layer; you're going to need it. You laid out three very real problems with using entities directly (I think Udi Dahan's article touches on this as well).
Your domain model should not acquiesce to the needs of the application, and soon enough your UI is going to need an event or collection filter that you're just going to have to stick into that entity. Let the presentation layer serve as the adapter instead and each layer will be able to maintain its integrity.
Let me be clear that the domain model does not have to be devoid of validation, but the validation that it contains should be domain-specific. For example, when attempting to give someone a pay raise, there may be a requirement that no raise can be awarded within 6 months of the last one so you'd need to validate the effective date of the raise. This is a business rule, is subject to change, and absolutely belongs in the domain model.

Repository Pattern: What is the 'right size'?

I'm building some repositories for an MVC application, and I'm trying to come up with the right way to divide responsibilities between repositories. In most cases, this is obvious. But there is one particular case where I'm not sure what the right answer is.
The users of this application need to track multiple types of time for their employees. For simplicity, let's consider only two. I'll call them "time cards" and "attendance." The exact nature of the difference between these two is not really important, but you should note that the end-users consider them entirely separate data. I think, though, that the reason they consider them entirely separate data is that they have never really had the opportunity to see them together in the past. Both types of records have almost entirely different business rules concerning editing the records, but they are also, generally speaking, both records of where an employee was at a particular time. Both types of time records have a great deal of properties in common, such as a total number of hours, and an employee for whom the time was collected. Both types also have a few properties which are completely unique to the individual type. We're keeping these "extra" properties in an instance of another type. So the general structure looks like this:
class TimeRecord
{
Person Employee { get; set; }
TimeSpan? Hours { get; set; }
}
class TimeCardData
{
TimeRecord Record { get; set; }
TProperty TimeCardProperty { get; set; }
}
class AttendanceData
{
TimeRecord Record { get; set; }
TProperty AttendanceProperty { get; set; }
}
So the question is, How many repositories are required here?
1 Repository
A design with only one repository would expose methods to return "time cards", "attendance" records, or both types in one list. This is fairly convenient for clients of the repository, but, to my mind, has a real danger of becoming a very fat class. I think that a repository for just "time cards" is already going to be one of the largest repositories in the system even without also handling "attendance" simply due to the complex business rules involved.
2 Repositories
Another design would have one repository for "time cards" and another repository for "attendance" records. This has the advantage that the business rules for, e.g., "time cards" are in a place by themselves. But I'd also like to have a way to get a list of all time records, regardless of type. It's not clear which repository to use for this case. Both?
3 Repositories
A design with one repository for "time cards", another repository for "attendance" records, and a third repository to deliver a read-only list of all time records is also a possibility. Like the 2 repository design, this has the advantage that the business rules for, e.g., "time cards" are in a place by themselves. It's now clear where to get the combined list. But I find it a bit weird that I could get the same record from two different repositories.
Hybrid
A hybrid approach would use a single repository, but move any business rules code (including selection of records) into separate types. In this example, a single "time record repository" would aggregate instances of business rule implementation classes for "time card" and "attendance" time. I think this is the approach I'm favoring right now.
Other?
Anything I've missed? Any compelling arguments for one design over the other?
Repositories are, at least not to my knowledge, a place for business rules. They are just a facade meant to mimic a collection; underneath they're basically pure data access (if that's they're job, you may not be persisting anything with a Repository as well). So separate repositories should not be considered for "business rules" reasons.
If your domain objects are really separate objects, then you should have separate repositories. Remember what a repository is: it's a facade. It mimics a collection to your domain. See here for a really good blog post on Repositories: http://devlicio.us/blogs/casey/archive/2009/02/20/ddd-the-repository-pattern.aspx
The repository is a facade; an abstraction.
That said... I do not think you have separate objects. You've got some issues here that have nothing to do with repositories and everything to do with the domain and the design of the domain. Are the two types of "timecards" actually two different things, or are they really the same?
You say, "But I find it a bit weird that I could get the same record from two different repositories."
That tells me that they are actually the same data, expressed in different ways. And there are ways to handle that.
If this is really the case, then what you have here is subclasses of a common base class (something that can be modeled in a DB pretty easy and handled elegantly with NHibernate, for instance).
I'll give you an example of a project I am working on. I have something called a "Broadcast". It's a base class; abstract. Can't be instantiated. I have two specific concrete types of this class: DeviceBroadcast and FileBroadcast. One streams audio/video from a device (like a DirectX capture card) and one streams audio/video from a file source (like an .mp3).
I have one repository that returns a Broadcast object. I can cast it to a FileBroadcast to manipulate specific information about a FileBroadcast, or I can cast to a DeviceBroadcast for the same reason - if it is of that type. A Broadcast cannot be both a FileBroadcast and DeviceBroadcast type. It has to be one or the other.
In the database I store the generic broadcast parameters in a Broadcast table, and then I store the file specific properties in a FileBroadcast table. Same goes for the DeviceBroadcast table; separate. When I query via the Repository, however, I just want Broadcasts. That's my root aggregate object and thus that's my repository.
The Broadcast base class has common methods that both subclasses use (like the GetCommand() method, which returns a specific command-line argument to launch a VLC process). Subclasses have to override and implement that method because it's abstract. In this way, the "business logic" that is unique to a FileBroadcast is contained in the FileBroadcast class. The "business logic" that is unique to a DeviceBroadcast is contained in the DeviceBroadcast class. Any logic that is common to both is contained in the superclass, Broadcast.
You seem to have a similiar situation here and that's why I am sharing my design. I think it might serve you well.
Above all, think about your domain and the data. If you are going to get duplicate data by way of separate repositories, then you need to give more thought to how you're designing the domain. Don't let the users dictate your domain design either. They know the domain from their perspective. All you have to do is be able to present the data to them in a way they understand. That doesn't mean you have to have a bad design; You can have a good design behind the scenes because your code is the thing that has to use the domain.

Resources