About software design: Where I must check parameters? - validation

Imagine I have an application that request to the user a name, a category list. When user click on save button, application will save name and category to a database.
I have a layer that get name and category from UI. This layer check if there is a name (a string with length > 0). If this is correct, it will pass name a category to another layer. Note: category is a radiobutton list where one item is always selected.
On this second layer, application will select a suitable class to save name, depending on category.
On last layer, a class will save this name on a database. On this class I will check if name is empty or not.
My question is: where is the right place to check method's input parameters? on every layer? Maybe, I'm going to use these layers on others developments.
Is my example correct? Maybe, I can left validation on database layer and raise an exception to UI layer.

In general, in terms of the larger question about validating input that is ultimately persisted, it is best to:
Convert the input parameters to a
fully encapsulated business object as
soon as possible after you receive
it.
Validate early and fail fast then to
wait until you get to a lower layer
-- waste of resources, waste of time, possibly more complex (more things to
roll back).
Validate the business logic once and make that part of
your object instantiation process. (but note that validation of view logic and persistence logic may need to be done at the other layers and is separate from business logic)
Model how your object is persisted
using an ORM (e.g., Hibernate) so
that you could work purely at the
object level in memory and leave
persistence as an implementation
detail. Focus the business logic at
the object layer.
And in terms of method validation itself, I agree with Oded -- at every layer, and it should be done immediate upon method entry. Again, this is part of the fail fast methodology. But, as I noted above, this doesn't mean you validate business logic at every method (or every layer). I'm just referring to the basic practice of validating inputs (either by assertions or explicit checks and exceptions thrown).

where is the right place to check method's input parameters? on every layer?
Yes, on every layer. This is called defense in depth.
There are different reasons to do so on each layer:
UI/Client code: keep things responsive and avoid roundtrips when data is invalid
Business layer: Ensure business rules are kept
Data layer: Ensure that valid data is passed through

I disagree with the recommendation about every layer, all the time. It sounds too dogmatic to me.
All design represents a choice between functionality and cost. Your validation choices should reflect that as well.
I think your decisions should take into account the sharing and reuse of layers.
If the database is shared by more than one application, then the database cannot depend on every application validating properly. In that case, the database must validate to protect itself.
In this day and age of SQL injection attacks, I think binding and validating prior to reaching the persistence tier is a must. The schema should do what it must to ensure referential and business integrity (e.g. unique constraints on columns and required "not null"), but other validations should be done prior to hitting the database.
If the database is wholly owned by one and only one application, and there's a service that is the only gateway into the data, then validation can be done by the service. The cost of duplicating the validation on the database layer can be waived.
Likewise between the UI and the service layer.
Double validation on client and service tiers is common because the service is shared by many clients in a service oriented architecture. Today your service might be used by a browser based UI; suddenly there's a flock of mobile apps along side it that clamor for services, too. In that case the service absolutely must validate and bind each and every request.
No manufacturer polishes every surface to mirror quality. Sometimes a rough cast surface is permissible, because the benefit of grinding and polishing is negligible and the cost is too great.
Same with software.
I don't like dogmatic statements. It's better to understand the implications of your choices. Know the rules and when it's appropriate to break them.

If you're going for very loose coupling where other applications will also make use of these same layers, then I would recommend doing input checking at every step. Since each class/method should have no knowledge of expectation of the other classes/methods which ran before them in the stack, each one should enforce its requirements individually.
If the UI requires that a value be in the text box when the button is clicked, it should validate accordingly.
If the business logic requires that a name never be null/empty, it shouldn't allow a null/empty value to be placed in that property.
If the data layer has a constraint on that field requiring that a value be present, it should check that a value is present before trying to persist the data.

Related

Best practices in DDD to validate an object

Let's say I have an entity called Order. My Domain must be certain that, as it receives an Order to be included, the Client associated with is active.
What are the best practices to validate that?
In other words, what are the best practices to validate some entity related to the one that is being validated? Where should that validation be called?
Ideally: your decision making (computation, branching, and so on) would be in the domain model. Your retrieval of the information would be in your application code, and your application code would pass the information to the domain model as an argument.
Your domain entity holds its information, and everything else it needs to know gets passed to it.
The life-isn't-always-ideal version: if proactively fetching Client::isActive is expensive, then you probably don't want to always fetch the information. So you either (a) design a protocol such that the domain model tells the application code that it needs this information, and the application code fetches the information and passes it back or (b) you pass to the domain model the capability to look up the information for itself (aka, pass a "domain service" that knows how to fetch the information on demand).
The two approaches have different tradeoffs regarding code complexity, timing, error handling.

Shall I use a DTO or not?

I'm building a web application with Spring, and I'm at the point where I have an Entity, a Repository, a RestController, and I can access endpoints in my browser.
I'm now trying to return JSON data to the browser, and I'm seeing all of this stuff about DTOs in various guides.
Do I really need a DTO? Can't I just put the serialization logic on the entity itself?
I think, this is a little bit debatable question, where the short answer would be:
It depends.
Little longer answer
There are plenty of people, who, in plenty of cases, would prefer one approach (using DTOs) over another (using bare entities), and vice versa; however, there is no the single source of truth on which is better to use.
It very much depends on the requirements, architectural approach you decide to stick with, (even on) personal preference and other (project-related) specific details.
Some even claim that DTO is an anti-pattern; some love using them; some think, that data refinement/adjustment should happen on the consumer/client side (for various reasons, out of which, one can be No Policy for API changes).
That being said, YES, you can simply return the #Entity instance (or list of entities) right from your controller and there is no problem with this approach. I would even say, that this does not necessarily violate something from SOLID or Clean Code principles.again, it depends on what do you use a response for, what representation of data do you need, what should be the capacity and purpose of the object in question, and etc..
DTO is generally a good practice in the following scenarios:
When you want to aggregate the data for your object from different resources, i.e. you want to put some object transformation logic between the Persistence Layer and the Business(or Web) Layer:
Imagine you fetch from your database a List<Employee>; however, from another 3rd party web-service, you also receive some complementary-to-employee data for each Employee object, which you have to aggregate in the Employee objects (aggregate, or do some calculation, or etc. point is that you want to combine the data from different resources). This is a good case when you might want to use DTO pattern. It is reusable, it conforms to Single-Responsibility Principle, and it is well segregated from other layers;
When you don't necessarily combine data received from different sources, but you want to modify the entity which you will be returning:
Imagine you have a very big Entity (with a lot of fields), and the client, which calls the corresponding endpoint (Front-End application, Mobile, or any client), has no need of receiving this huge entity (or list of entities). If you, despite the client's requirement, will still be sending the original/unchanged entity, you will end up consuming network bandwidth/load inefficiently (more than enough), performance will be weaker, and generally, you will be just wasting computing resources for no good reason. In this case, you might want to transform your original Entity to the DTO object, which the client needs (only with required fields). Here, you might even want to implement different DTO classes, for one entity, for different consumers/clients.
However, if you are sure, that your table/relation representations (instances of #Entity classes) are exactly what the client needs, I see no necessity of introducing DTOs.
Supporting further the idea, that #Entity can be returned to the presentation layer without DTO
Java Persistence with Hibernate, Second Edition, in §3.3.2, even motivates it explicitly, that:
You can reuse persistent classes outside the context of persistence, in unit tests or in the presentation layer, for example. You can create instances in any runtime environment with the regular Java new operator, preserving testability and reusability;
Hibernate entities do not need to be explicitly Serializable;
You might also want to have a look at this question.
In general, it’s up to you to decide. If your application is relatively simple and you don’t expose any sensitive information, an response is y ambiguous for the client, there is nothing criminal in returning back the whole entity. If your client expect a small slice of entity, eg only 2-3 fields from 30 fields entity, then it make sense to do the translation or consider different protocol such as GraphQL.
It is ideal design where you should not expose the entity.
It is a good design to convert your entity to DTO before you pass the same to web layer.
These days RestJpacontrollers are also available.
But again it all varies from application to application which one to use.
If your application does a need only read only operation then make sense to use RestJpacontrollers and can use entity at web layer.
In other case where application modifies data frequently then in that case better option to opt DTO and use it at the UI layer.
Another case is of multiple requests are required to bring data for a particular task. In the same case data to be brought can be combined in a DTO so that only one request can bring all the required data.
We can use data of multiple entities data into one DTO.
This DTO can be used for the front end or in the rest API.
Do I really need a DTO? Can't I just put the serialization logic on the entity itself?
I'd say you don't, but it is better to use them, according to SOLID principles, namely single responsibility one. Entities are ORM should be used to interact with database, not being serialized and passed to the other layers.

Is there a elegant way of putting domain validations inside domain layer?

I am new to domain driven design and there is one thing that is bothering me when I'm writing domain model. How to handle domain validation?
I am designing library management system where user can search through books and see if book is on the stock.
If it is not, user can create request for book so some kind of queue is created. Rule is that we don't have any book in stock. Right now I have information about quantity inside book entity and that is not a problem but what if i have different bounded context for requesting books and book catalog. Then I must somehow contact another vertical/service and ask (validate) that book quantity is zero before creating book aggregate.
Also I am checking if user have valid membership card, is book already borrowed by him, do user have active requests for any book.
Things that bother me.
I need to know what exactly to include in aggregate before passing it to domain model because of validations. I am not sure that is safest approach because my validations accuracy will depend from specification/query, etc.
Another very important thing. When application layer method start with execution and something is not valid client will get only validation messages for code that was executed and there is good chance there is more things that are preventing code for execution. This can be really inconvenient if user is filling some form.
First thoughts for solving this problem.
I have command/handler architecture and I am using MediatR so I am thinking to move domain validations between command and handler and that will solve my problems for now but that approach will spread domain knowledge across bounded context and domain model will not be smart enough to guard from not valid actions. More precise I will need to think before executing application method (handler) what I need to validate.
So I am really curious. Is there any clear way of handling domain validations inside domain model?
Is there any clear way of handling domain validations inside domain model?
Yes; they require work and careful thinking.
One aspect of careful thinking is to distinguish message validation from domain logic. Message validation is an isolated thing, a message is valid or not according to the schema of the message -- are all of the required fields present, is the data in the right form, are the numbers in the allowed range, and so on. Really, we're asking the question "did the client fill out the form correctly?"
Integrating a valid message with previously known information (aka, the "state" of the domain model) is a domain logic concern. State is chosen deliberately - the domain model is a state machine for the bookkeeping of your domain.
Depending on your domain, and the information that is available, there can be states that mean that the client doesn't get what they want. "The road less traveled" doesn't mean that things are invalid.
Furthermore, if your system is distributed (different pieces of data are the responsibility of different authorities), then any locally cached copies of that data are necessarily stale, and may be out of date. See Pat Helland's Memories, Guesses, and Apologies. That we will sometimes produce an incorrect answer is an inevitable consequence of distributing the work. If we're responsible, then we performed a cost benefit analysis to ensure that the expected benefits of distributing the work offset the expected risks.

How to handle dependent behavior in a domain class?

Let's say I've got a domain class, which has functions, that are to be called in a sequence. Each function does its job but if the previous step in the sequence is not done yet, it throws an error. The other way is that each function completes the step required for it to run, and then executes its own logic. I feel that this way is not a good practice, since I am adding multiple responsibilities, and the caller wont know what all operations can happen when he invokes a method.
My question is, how to handle dependent scenarios in DDD. Is it the responsibility of the caller to invoke the methods in the right sequence? Or do we make the methods handle the dependent operations before it's own logic?
Is it the responsibility of the caller to invoke the methods in the right sequence?
It's ok if those methods have a business meaning. For example the client may book a flight, and then book a hotel room. Both of those is something the client understands, and it is the client's logic to call them in this sequence. On the other hand, inserting the reservation into the database, then committing (or whatever) is technical. The client should not have to deal with that at all. Or "initializing" an object, then calling other methods, then calling "close".
Requiring a sequence of technical calls is a form of temporal coupling, it is considered a bad practice, and is not directly related to DDD.
The solution is to model the problem better. There is probably a higher level use-case the caller wants achieved with this call sequence. So instead of publishing the individual "steps" required, just support the higher use-case as a whole.
In general you should always design with the goal to get any sequence of valid calls to actually mean something (as far as the language allows).
Update: A possible model for the mentioned "File" domain:
public interface LocalFile {
RemoteFile upload();
}
public interface RemoteFile {
RemoteFile convert(...);
LocalFile download();
}
From my point of view, what you are describing is the orchestration of domain model operations. That's the job of the application layer, the layer upon domain model. You should have an application service that would call the domain model methods in the right sequence, and it also should take into account whether some step has left any task undone, and in such case, tell the next step to perform it.
TLDR; Scroll to the bottom for the answer, but the backstory will give some good context.
If the caller into your domain must know the order in which to call things, then you have missed an opportunity to encapsulate business logic in your domain, which is a symptom of an anemic domain.
#RobertBräutigam made a very good point:
Requiring a sequence of technical calls is a form of temporal coupling, it is considered a bad practice, and is not directly related to DDD.
This is true, but it is worse when you do it with your domain model because non-domain concerns get intermixed with domain concerns. Intent becomes lost in a sea of non business logic. If you can, you look for a higher-order aggregate that encapsulates the ordering. To borrow Robert's example, rather than booking a flight then a hotel room, and forcing that on the client, you could have a Vacation aggregate take both and validate it.
I know that sounds wrong in your case, and I suspect you're right. There's a clear dependency that can't happen all at once, so we can't be the end of the story. When you have a clear dependency with intermediate transactions that must occur before the "final" state, we have... orchestration (think sagas, distributed transactions, domain events and all that goodness).
What you describe with file operations spans across transactions. The manipulation (state change) of a domain is transactional at each point in a distributed transaction, but is not transactional overall. So when #choquero70 says
you are describing is the orchestration of domain model operations. That's the job of the application layer, the layer upon domain model.
that's also correct. Orchestration is key. Each step must manipulate the state of the domain once, and once only, and leave it in a valid state, but it OK for there to be multiple steps.
Each of those individual points along the timeline are valid moments in the state of your domain.
So, back to your model. If you expose a single interface with multiple possible calls to all steps, then you leave yourself open to things being called out of order. Make this impossible or at least improbable. Orchestration is not just about what to do, but what to prevent from happening. Create smaller interfaces/classes to avoid accidentally increasing the "surface area" of what could be misused accidentally.
In this way, you are guiding the caller on what to do next by feeding them valid intermediate states. But, and this is the important part, the burden on what to call in what order is not on the caller. Sure, the caller could know what to do, but why force it.
Your basic algorithm is the same: upload, transform, download.
Is it the responsibility of the caller to invoke the methods in the right sequence?
Not exactly. Is the responsibility of the caller to choose from legitimate choices given the state of your domain. It's "your" responsibility to present these choices via business methods on your correctly modeled moment/interval aggregate suitable for the caller to use.
Or do we make the methods handle the dependent operations before it's own logic?
If you've setup orchestration correctly, this won't be necessary. But it does make sense to validate anyway.
On a side note, each step of the orchestration you do should be very linear in nature. I tell my developers to be suspicious of an orchestration step that has an if statement in it. If there's an if it's likely better to be part of another orchestration step or encapsulated in business logic.

Where do you perform your validation?

Hopefully you'll see the problem I'm describing in the scenario below. If it's not clear, please let me know.
You've got an application that's broken into three layers,
front end UI layer, could be asp.net webform, or window (used for editing Person data)
middle tier business service layer, compiled into a dll (PersonServices)
data access layer, compiled into a dll (PersonRepository)
In my front end, I want to create a new Person object, set some properties, such as FirstName, LastName according to what has been entered in the UI by a user, and call PersonServices.AddPerson, passing the newly created Person. (AddPerson doesn't have to be static, this is just for simplicity, in any case the AddPerson will eventually call the Repository's AddPerson, which will then persist the data.)
Now the part I'd like to hear your opinion on is validation. Somewhere along the line, that newly created Person needs to be validated. You can do it on the client side, which would be simple, but what if I wanted to validate the Person in my PersonServices.AddPerson method. This would ensure any person I want to save would be validated and removes any dependancy on the UI layer doing the work. Or maybe, validate both in UI and in by business server layer. Sounds good so far right?
So, for simplicity, I'll update the PersonService.AddPerson method to perform the following validation checks
- Check if FirstName and LastName are not empty
- Ensure this new Person doesn't already exist in my repository
And this method will return True if all validation passes and the Person is persisted, False if Validation fails or if the Person is not persisted.
But this Boolean value that AddPerson returns isn't enough for me at the UI layer to give the user a clear reason why the save process failed. So what's a lonely developer to do? Ultimately, I'd like the AddPerson method to be able to ensure what its about to save is valid, and if not, be able to communicate the reasons why it's not invalid to my UI layer.
Just to get your juices flowing, some ways of solving this could be: (Some of these solutions, in my opinion, suck, but I'm just putting them there so you get an understanding of what I'm trying to solve)
Instead of AddPerson returning a boolean, it can return an int (i.e. 0 = Success, Non Zero equals failure and the number indicates the reason why it failed.
In AddPerson, throw custom exceptions when validation fails. Each type of custom exception would have its own error message. In addition, each custom exception would be unique enough to catch in the UI layer
Have AddPerson return some sort of custom class that would have properties indicating whether validation passed or failed, and if it did fail, what were the reasons
Not sure if this can be done in VB or C#, but attach some sort of property to the Person and its underlying properties. This "attached" property could contain things like validation info
Insert your idea or pattern here
And maybe another here
Apologies for the long winded question, but I definately like to hear your opinion on this.
Thanks!
Multiple layers of validation go well with multi-layer apps.
The UI itself can do the simplest and quickest checks (are all mandatory fields present, are they using the appropriate character sets, etc) to give immediate feedback when the user makes a typo.
However the business logic should have the lion's share of validation responsibilities... and for once it's not a problem if this is "repetitious", i.e., if the business layer re-checks something that should already have been checked in the UI -- the BL should check all the business rules (this double checks on UI's correctness, enables multiple different UI clients that may not all be perfect in their checks -- e.g. a special client on a smart phone which may not have good javascript, and so on -- and, a bit, wards against maliciously hacked clients).
When the business logic saves the "validated" data to the DB, that layer should perform its own checks -- DBs are good at that, and, again, don't worry about some repetition -- it's the DB's job to enforce data integrity (you might want different ways to feed data to it one day, e.g. a "bulk loader" to import a number of Persons from another source, and it's key to ensure that all those ways to load data always respect data integrity rules); some rules such as uniqueness and referential integrity are really best enforced in the DB, in particular, for performance reasons too.
When the DB returns an error message (data not inserted as constraint X would be violated) to the business layer, the latter's job is to reinterpret that error in business terms and feed the results to the UI to inform the user; and of course the BL must similarly provide clear and complete info on business rules violation to the UI, again for display to the user.
A "custom object" is thus clearly "the only way to go" (in some scenarios I'd just make that a JSON object, for example). Keeping the Person object around (to maintain its "validation problems" property) when the DB refused to persist it does not look like a sharp and simple technique, so I don't think much of that option; but if you need it (e.g. to enable "tell me again what was wrong" functionality, maybe if the client went away before the response was ready and needs to smoothly restart later; or, a list of such objects for later auditing, &c), then the "custom validation-failure object" could also be appended to that list... but that's a "secondary issue", the main thing is for the BL to respond to the UI with such an object (which could also be used to provide useful non-error info if the insertion did in fact succeed).
Just a quick (and hopefully helpful) comment: when you're wondering where to place validation, try pretending that, soon, you're going to completely recreate your UI layer using a technology you're not yet so familiar with**. Try to keep out of that layer any validation-like business logic that you know for certain you'd have to rewrite in the new technology.
You'll find exceptions - business logic that ends up in your UI layer regardless, but it's a useful consideration nonetheless.
** Mobile dev, Silverlight, Voice XML, whatever - pretending you don't know the technology of your "new" UI layer helps you abstract your concerns and get less mired in implementation details.
The only important points are:
From the perspective of the front-end(s), the Middle Tier must perform all validation, you never know whether someone is going to try circumventing your front-end validation by talking directly to your Middle Tier (for whatever reason)
The Middle Tier may elect to delegate some of that validation to the DB layer (e.g. data integrity constraints)
You may optionally duplicate some validation in the UI, but that should only be for the sake of performance (to avoid round-trips to the Middle Tier for common scenarios, such as missing mandatory fields, incorrectly formatted data, etc.) These checks should never take the place of doing them in the Middle Tier
Validation should be done at all three levels.
When I am in a project I assume I am making a framework, which most of the time is not the case. Each layer is separate and must check all layers input before doing an operation
Each level can have a different way of doing it, it is not necessary they all use the same, but ideally they should all use the same validation with the ability to customize it.
You never want to let bad data into the database. So you can never trust the data you are getting from the business layer. It needs to be checked.
In the business layer you can never trust the UI layer, and you must check it to prevent un-needed calls to the database layer. The UI layer works the same way.
I disagree with David Basarab's comment that the same validations should be present in all layers. This defies the paradigm of responsibility of layers for one reason. Secondly, though the main intention is to make the layers (or components) loosely coupled, it is also important that a level of responsibility (and hence trust) is endowed on the layers. Though it might be necessary to duplicated some validations in UI and Business Layer (since UI layer can be bypassed by hacking attempts), however, it is not advisable to repeat the validations in each layer. Each layer should perform only those validations which they are responsible for. The biggest flaw in repeting validations in all layers is code redundancy, which can cause maintenance nightmare.
A lot of this is more style than substance. I personally favor returning status objects as a flexible and extensible solution. I would say that I think there are a couple classes of validation in play, the first being "does this person data conform to the contract of what a person is?" and the second being "does this person data violate constraints in the database?" I think the first validation can, and should be done at the client. The second should be done at the middle tier. With this division, you may find that the only reasons the save could fail are 1)violates a uniqueness constrains, or 2)something catastrophic. You could then return false for the first case, and throw an exception for the other.
If tier R is closer to the user (or any input stream you don't control) than tier S then tier S should validate all data received from tier R. This does not mean that tier R shouldn't validate data. It's better for the user if the GUI warns him he's making a mistake before he attempts a new transaction. But no matter how bulletproof the validation in your GUI is, the next tier up should not trust that any validation has taken place.
This assumes your database in completely under your control. If not, you have bigger problems.
Also, you could have the UI pass the data needed to build a Person object through some sort of PersonBuilder object, so that object creation is consolidated in the domain/business layer, and you can keep the Person object in a state that is always consistent. This makes more sense for more complex entities, however even for simple ones, it is good to centralize object creation, just like you centralize persistence, etc.

Resources