Should my repository enforce validity of data? If so, where and how?

I have a repository data access pattern like so:
public interface IRepository<T>
{
    // ... query members ...
    void Add(T item);
    void Remove(T item);
    void SaveChanges();
}
Imagine I have a repository of users, and each user has a unique username. If I create a new user with a username that already exists (imagine a dumb UI layer that doesn't check), adding it to the repository succeeds. When I call SaveChanges, the repository attempts to save the item to the database. Luckily my database enforces the rule, and the save is aborted with an exception due to a unique key violation.
It seems to me that this validation is generally done in the layer ABOVE the repository: the layers that call it know they should ensure this rule, so they pre-check and then execute (hopefully in some kind of transaction scope to avoid races, though that doesn't always seem possible given the medium ignorance that exists).
Shouldn't my repository be enforcing these rules? What happens if my medium is dumb, such as a flat database without any integrity checks?
And if the repository validates these kinds of things, how would it inform callers about a violation in a way that lets the caller accurately identify what went wrong? Exceptions seem like a poor way to handle this because they're relatively expensive and hard to specialize down to a specific violation.
I've been playing around with a 'Can' pattern, for example CanAdd alongside Add. CanAdd returns a list of violations describing what went wrong; Add calls CanAdd and throws an invalid operation exception if CanAdd reports a violation. This way I can stack these routines through the layers. For example, the service layer above would also have a 'Can' method that returns the repository's report plus any additional violations it wants to check (more complicated business rules, such as which users can invoke specific actions).
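To make the 'Can' pattern concrete, here is a minimal sketch of how it could look. All type and member names (Violation, UserRepository, UserService, CanRegister, the callerIsAdmin flag) are assumptions made up for illustration, and the in-memory lists merely stand in for a real data store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; none of these names come from a real library.
public sealed class Violation
{
    public Violation(string code, string field, string message)
    {
        Code = code; Field = field; Message = message;
    }
    public string Code { get; private set; }
    public string Field { get; private set; }
    public string Message { get; private set; }
}

public sealed class User
{
    public User(string username) { Username = username; }
    public string Username { get; set; }
}

public sealed class UserRepository
{
    private readonly List<User> _stored = new List<User>();   // stands in for the database
    private readonly List<User> _pending = new List<User>();  // added but not yet saved

    // Reports problems without throwing, so callers can pre-check cheaply.
    public IList<Violation> CanAdd(User user)
    {
        var violations = new List<Violation>();
        if (string.IsNullOrWhiteSpace(user.Username))
            violations.Add(new Violation("Required", "Username", "Username is required."));
        else if (_stored.Concat(_pending).Any(u => u.Username == user.Username))
            violations.Add(new Violation("Duplicate", "Username", "Username is already taken."));
        return violations;
    }

    // Add re-runs the same check and throws only if the caller skipped or ignored it.
    public void Add(User user)
    {
        var violations = CanAdd(user);
        if (violations.Count > 0)
            throw new InvalidOperationException(violations[0].Message);
        _pending.Add(user);
    }

    public void SaveChanges()
    {
        _stored.AddRange(_pending);
        _pending.Clear();
    }
}

public sealed class UserService
{
    private readonly UserRepository _repository;
    public UserService(UserRepository repository) { _repository = repository; }

    // Stacks a service-level rule on top of the repository's report.
    public IList<Violation> CanRegister(User user, bool callerIsAdmin)
    {
        var violations = new List<Violation>(_repository.CanAdd(user));
        if (!callerIsAdmin)
            violations.Add(new Violation("Forbidden", "", "Only administrators may register users."));
        return violations;
    }
}
```

A caller would typically invoke CanRegister first, show any violations to the user, and only call Add when the list is empty. Against a real database a unique constraint is still needed, because another process can insert the same username between the check and the save.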
Validation of data is so fundamental, yet I feel there is no real guidance on how to handle more advanced validation requirements reliably.
Edit: additionally, in this case, how do you handle validation of entities that are in the repository and are updated via change tracking? For example:
using (var repo = ...)
{
    var user = repo.GetUser("user_b");
    user.Username = "user_a";
    repo.SaveChanges(); // boom!
}
As you can imagine, this will cause an exception. Going deeper down the rabbit hole, imagine I've got a validation system in place for when I add a user, and I do something like this:
using (var repo = ...)
{
    var newUser = new User("user_c");
    repo.Add(newUser); // so far so good.
    var otherUser = repo.GetUser("user_b");
    otherUser.Username = "user_c";
    repo.SaveChanges(); // boom!
}
In this case, validating when adding the user was pointless, because 'downstream' actions can still break things. The add validation rule would need to check both the actual persistent storage AND any items queued up to persist.
This still doesn't solve the previous change tracking problem. So do I now start validating the SaveChanges call? It seems there would be a huge number of violations that could arise from apparently unrelated actions.
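One way to picture validating at SaveChanges is to check the whole set of tracked entities as it would look after the save, rather than each Add or property change in isolation. The sketch below is only an illustration under simplifying assumptions: TrackingUserRepository, FindDuplicateUsernames and PersistedUsernames are invented names, every row is assumed to be tracked in memory, and a real implementation would also have to query rows that are not loaded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class User
{
    public User(string username) { Username = username; }
    public string Username { get; set; }
}

// Hypothetical repository: assumes every entity is tracked in memory.
public sealed class TrackingUserRepository
{
    private readonly List<User> _tracked = new List<User>(); // every loaded or added entity
    private List<string> _persisted = new List<string>();    // stands in for the database rows

    public void Add(User user) { _tracked.Add(user); }

    public User GetUser(string username)
    {
        return _tracked.First(u => u.Username == username);
    }

    // Usernames that would collide if the tracked state were written out now.
    public IList<string> FindDuplicateUsernames()
    {
        return _tracked.GroupBy(u => u.Username)
                       .Where(g => g.Count() > 1)
                       .Select(g => g.Key)
                       .ToList();
    }

    // Validates the combined result of all queued adds and edits before persisting anything.
    public void SaveChanges()
    {
        var duplicates = FindDuplicateUsernames();
        if (duplicates.Count > 0)
            throw new InvalidOperationException("Duplicate usernames: " + string.Join(", ", duplicates));
        _persisted = _tracked.Select(u => u.Username).ToList();
    }

    public IList<string> PersistedUsernames { get { return _persisted.AsReadOnly(); } }
}
```

With this shape, the add-then-rename sequence from the question is rejected as a whole and nothing is written. It does not remove the need for a database constraint, since concurrent writers can still race between the check and the write.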
Perhaps I'm asking for an unrealistic, perfect safety net?
Thanks in advance,
Stephen.

The ideal rule is that each of your layers should be a black box, and none of them should depend on the validation of another layer. The reason is that the DB has no idea of the UI and vice versa. So when the DB throws an exception, the UI must have DB knowledge (a bad thing) to convert it into something the UI layer can understand, so that it can eventually turn it into something the user can understand. Ugh.
Unfortunately, validating in every layer is also hard. My solution: either put the validation in a single place (maybe the business layer) and make the other layers really dumb, so that they don't check anything themselves.
Or write your validation in an abstract way into the model and then generate all validation from that. For example:
String name;
Description nameDesc = new Description("name",
new MaxLength(20), new NotNull());
This way, you can write code that examines the Description objects (through code generation or even at runtime) and performs the validation in each layer at little cost, because one change fixes all layers.
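As a rough sketch of the runtime variant, the Description, MaxLength and NotNull names from the snippet above could be backed by something like the following. The IConstraint interface, the Check and Validate methods and the message texts are assumptions for illustration, not an existing API.

```csharp
using System.Collections.Generic;

// Hypothetical constraint contract: returns null when the value is acceptable.
public interface IConstraint
{
    string Check(string fieldName, string value);
}

public sealed class NotNull : IConstraint
{
    public string Check(string fieldName, string value)
    {
        return value == null ? fieldName + " must not be null" : null;
    }
}

public sealed class MaxLength : IConstraint
{
    private readonly int _max;
    public MaxLength(int max) { _max = max; }

    // Null is left to NotNull, so the two constraints do not report the same problem twice.
    public string Check(string fieldName, string value)
    {
        return value != null && value.Length > _max
            ? fieldName + " must be at most " + _max + " characters"
            : null;
    }
}

public sealed class Description
{
    private readonly IConstraint[] _constraints;

    public Description(string fieldName, params IConstraint[] constraints)
    {
        FieldName = fieldName;
        _constraints = constraints;
    }

    public string FieldName { get; private set; }

    // Any layer can call this; the rules themselves are declared only once.
    public IList<string> Validate(string value)
    {
        var errors = new List<string>();
        foreach (var constraint in _constraints)
        {
            var error = constraint.Check(FieldName, value);
            if (error != null) errors.Add(error);
        }
        return errors;
    }
}
```

The UI, the business layer and the persistence layer could each hold the same Description instance and call Validate, or a generator could read the constraints to emit column definitions and client-side checks.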
[EDIT] For validation, you only have these cases:
Duplicate key
Above some limit
Below some limit
Null (not specified)
Empty
Formatting error (date fields, etc)
So you should be able to get away with a handful of exception classes that carry the object, the field, the old and new values, plus special info such as the limit that was hit. So I'm wondering where your many exception classes come from.
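Taken to the extreme, the cases listed above could even share a single exception type that is distinguished by a kind value. The following is only a sketch; ViolationKind, FieldValidationException and their members are invented names.

```csharp
using System;

// One value per case from the list above.
public enum ViolationKind { DuplicateKey, AboveLimit, BelowLimit, Null, Empty, Format }

// Hypothetical exception carrying enough detail for any layer to build its own message.
public class FieldValidationException : Exception
{
    public FieldValidationException(ViolationKind kind, string objectName, string field,
                                    object oldValue, object newValue, object limit = null)
        : base(kind + " violation on " + objectName + "." + field)
    {
        Kind = kind;
        ObjectName = objectName;
        Field = field;
        OldValue = oldValue;
        NewValue = newValue;
        Limit = limit;
    }

    public ViolationKind Kind { get; private set; }
    public string ObjectName { get; private set; }
    public string Field { get; private set; }
    public object OldValue { get; private set; }
    public object NewValue { get; private set; }
    public object Limit { get; private set; } // only meaningful for AboveLimit and BelowLimit
}
```

A caller can catch FieldValidationException once and switch on Kind and Field to choose a user-facing message, instead of catching a separate exception class per rule.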
For your other question, this is ... uh ... "solved" by the two-phase commit protocol. I say "solved" because there are situations where the protocol breaks down, and in my experience it's much better to give the user a "Retry?" dialog or some other means of fixing the problem than to invest a lot of time in two-phase commit.

Related

Entity/Domain purity dilemma in the clean architecture/Domain-driven design

I'm working on an eCommerce system in which I'm trying to implement the clean architecture.
But currently I'm a little bit stuck.
I have a use case called CreateItemUseCase, in which I create an Item (alias product) for the shop.
In this use case I call a method (createItemEntity()) of an entity called ItemEntity.
This method just creates a data object with data like:
userId
itemTitle
itemDescription
...
Now I need another method in ItemEntity that validates the userId.
To create an Item, the user needs to have a userId, so the method in ItemEntity would be called:
validateUserId()
This method should check whether the user has a userId in the database; if not, creating the Item would be impossible.
Now my question:
How do I validate the userId?
Should the validateUserId() method take an array as a parameter, in which all the user IDs are stored? Something like this:
validateUserId(toBeValidated: Int, allUserIds: Array[Int])
{
// loop through the allUserIds to see if toBeValidated is in there ...
}
Or should I query the data in the method (which I'm pretty sure would violate the dependency rule), like this:
validateUserId(toBeValidated: Int)
{
// get all user IDs through a query, and check if toBeValidated is in there ...
}
Or should I do it completely differently?
In general, entities should only contain logic that operates on information (data) within the entity's scope. Knowing how to query whether a user with a certain user ID exists is not in the scope of the item entity.
I think your motivation to keep all the validation logic together is reasonable, but on the other hand you should not introduce infrastructure dependencies (like talking to the database or the user repository) into the entity.
"Or should I query the data in the method (which I'm pretty sure would violate the dependency rule), like this"
Exactly. That's why it's usually best to avoid that and keep entities free from such dependencies. Introducing them can easily get out of hand and also makes such entities more complex to test. If you do need to do it, it should be a well-considered decision with a clear justification.
"Should the validateUserId() method take an array as a parameter, in which all the user IDs are stored? Something like this"
This is not such a bad idea in general, because it does not make the entity dependent on infrastructure and it provides the entity with all the data it needs for decision making. On the other hand, you can now run into another problem: bad performance.
You would now retrieve all user IDs every time you create an item. If you check for the user's existence somewhere else, this can be optimized much better.
I suggest asking the user repository whether the user exists before performing the entity creation, which would include all the other validations that make sense inside the item entity. The user repository could have a query optimized for just checking the existence of this user by ID.
If these two operations (checking for the user's existence and creating the new item) only happen in one place in the application, I'd be pragmatic and perform the user existence check directly in the use case. If they occur in several places in your application, you can extract that logic into a separate (domain) service (e.g. an item service) that deals with the repetitive flow of working with the user repository and the item entity.
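The pragmatic variant, with the existence check in the use case, might look roughly like the sketch below. It is written in C# purely for illustration, since the question does not name a language. IUserRepository, Exists, Execute and InMemoryUserRepository are assumed names, and the title check is a made-up example of a rule that belongs inside the entity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical port owned by the application layer; the implementation lives in infrastructure.
public interface IUserRepository
{
    bool Exists(int userId); // can be implemented as a cheap existence query
}

public sealed class ItemEntity
{
    public ItemEntity(int userId, string title, string description)
    {
        // Only rules that depend on the item's own data live here.
        if (string.IsNullOrWhiteSpace(title))
            throw new ArgumentException("Item title is required.", "title");
        UserId = userId;
        Title = title;
        Description = description;
    }

    public int UserId { get; private set; }
    public string Title { get; private set; }
    public string Description { get; private set; }
}

public sealed class CreateItemUseCase
{
    private readonly IUserRepository _users;
    public CreateItemUseCase(IUserRepository users) { _users = users; }

    public ItemEntity Execute(int userId, string title, string description)
    {
        // Cross-entity check happens here, so ItemEntity stays free of infrastructure.
        if (!_users.Exists(userId))
            throw new InvalidOperationException("User " + userId + " does not exist.");
        return new ItemEntity(userId, title, description);
    }
}

// In-memory stand-in so the sketch can be exercised without a database.
public sealed class InMemoryUserRepository : IUserRepository
{
    private readonly HashSet<int> _ids;
    public InMemoryUserRepository(IEnumerable<int> ids) { _ids = new HashSet<int>(ids); }
    public bool Exists(int userId) { return _ids.Contains(userId); }
}
```

If the same check is later needed from other use cases, the body of Execute is the part that would move into a domain service.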
What you are dealing with here is a trade-off between domain model purity, domain model completeness, and performance. In this great blog this is named the Domain-Driven Design Trilemma. I suggest going through the reasoning in the article; I'm pretty sure it will help you come to a final decision.
I think this is a side case of what we call Business Gerunds.
Details: https://www.forbes.com/sites/forbestechcouncil/2022/05/19/10-best-practices-for-event-streaming-success/
If Item has to validate the user, look at which attributes the entities have in common and who is responsible for changing them. A segregation can then be made in the DDD representation, and by using a composite via translation, outside-world entities can exist as the same.

Is there any way in Entity Framework 4 of making Validation produce a warning instead of an error?

As far as I can see, the validation within Entity Framework is built entirely around the assumption that, if an item fails its validation, it must not be persisted to the database. Is there any mechanism, possibly running parallel to normal validation, of making a constraint on a field produce a warning to the user, rather than an error which prevents the record from being saved/updated?
To be more specific, I have a situation where a particular numerical field has limits on it, but these are advisory rather than hard-and-fast. If the user enters a value outside these limits, they should get a warning, but should still be able to save the record.
In theory, I could subclass the ValidationResult class to make, say, a ValidationWarning class, then create a custom subclass of ValidationResults whose IsValid property was sensitive to the presence of ValidationWarning messages and ignored them when deciding whether the entity is valid. However, this requirement has arisen in a project which is already some way along in its development, and it would require a lot of refactoring to make this kind of custom subclassing work properly. I would prefer to find a mechanism which could be levered in without creating that much disruption/rework.
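For illustration, the idea of an IsValid that ignores warnings can be sketched independently of Entity Framework. Everything below (Severity, ValidationIssue, ValidationReport, ReadingValidator and the limits used) is hypothetical and does not hook into EF's own validation pipeline; the hard error on negative values is an assumption added only to contrast with the advisory limit.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Severity { Warning, Error }

public sealed class ValidationIssue
{
    public ValidationIssue(string field, string message, Severity severity)
    {
        Field = field; Message = message; Severity = severity;
    }
    public string Field { get; private set; }
    public string Message { get; private set; }
    public Severity Severity { get; private set; }
}

public sealed class ValidationReport
{
    private readonly List<ValidationIssue> _issues = new List<ValidationIssue>();

    public void Add(string field, string message, Severity severity)
    {
        _issues.Add(new ValidationIssue(field, message, severity));
    }

    // Only errors block saving; warnings are shown to the user but ignored here.
    public bool IsValid
    {
        get { return !_issues.Any(i => i.Severity == Severity.Error); }
    }

    public IList<ValidationIssue> Warnings
    {
        get { return _issues.Where(i => i.Severity == Severity.Warning).ToList(); }
    }
}

public static class ReadingValidator
{
    // Hypothetical rules: a negative value is a hard error; the upper limit of 100 is advisory.
    public static ValidationReport Validate(decimal value)
    {
        var report = new ValidationReport();
        if (value < 0)
            report.Add("Value", "Value must not be negative.", Severity.Error);
        else if (value > 100)
            report.Add("Value", "Value is above the advisory limit of 100.", Severity.Warning);
        return report;
    }
}
```

The calling code would display Warnings to the user and proceed with the save whenever IsValid is true.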
I had a similar requirement on a project, and this is how I solved it: if ModelState.IsValid was false, I cleared the errors out of the ModelState and sent it on its way again, then logged the "error" to another service. This is a bit of a hack and I wouldn't recommend it, as it is not exactly best practice.

How do I avoid duplicating validation logic between the domain and application layers?

Any given entity in my domain model has several invariants that need to be enforced -- a project's name must be at least 5 characters, a certain product must exist to be associated with the project, the due date must not be prior to the current date and time, etc.
Obviously I want the client to be able to display error messages related to validation, but I don't want to constantly maintain the validation rules between several different layers of the program -- for example, in the widget, the controller, the application service or command object, and the domain. Plus, it would seem that a descriptive error message is presentation-related and not belonging to the domain layer. How can I solve these dilemmas?
I would create specific exceptions related to your expected error conditions. This is standard for Exception handling in general and will help with your issue. For example:
public class ProjectNameNotLongEnoughException : System.Exception
or
public class DueDatePriorToCurrentDateException : System.Exception
Mark these possible exceptions in the XML comments for the methods that may throw them, so that applications written against your domain model know to watch out for these exceptions and can present a message within the presentation layer of the application. This also allows you to have localized error messages based on the culture without cluttering up your domain model with presentation concerns.
If you choose to perform client-side validation, I'm afraid that you can't have your cake and eat it too. In this case, you may have to duplicate validation logic in order to achieve the desired features while maintaining your architecture.
Hope this helps!
I realise this is an old question, but this may help others in a similar situation.
You have here Behavior and Conditions which you need to encapsulate into your domain model.
For example, the ProjectName having a requirement on a certain length I would suggest should be encapsulated within a ValueObject. It may seem overboard for some, but within our Domain Model we almost always encapsulate native types, especially String, within a ValueObject. This then allows you to roll your validation within the constructor of the ValueObject.
Within the Constructor you can throw an Exception relating to the violation of the parameters passed in. Here is an example of one of our ValueObjects for a ZoneName:
public ZoneName(string name)
{
    if (String.IsNullOrWhiteSpace(name))
    {
        // ArgumentNullException(paramName, message): the parameter name comes first.
        throw new ArgumentNullException("name", "Zone name is required");
    }
    if (name.Length > 33)
    {
        throw new ArgumentException("Zone name must be at most 33 characters long", "name");
    }
    Name = name;
}
Now consumers of that ValueObject can either perform their own validation before calling the constructor, or not, but either way your invariants will be consistent with your model design.
One way we build validation rules within the domain model and then use them within the UI is the MediatR library, which uses a One Model In, One Model Out pattern and allows you to define validators for each of your Query or Command models. These are defined using FluentValidation. You can then add a provider to the ModelValidatorProviders within MVC. Take a look at Jimmy Bogard's ContosoUniversity example here https://github.com/jbogard/ContosoUniversity/tree/master/src/ContosoUniversity and look at the DependancyResolution folder, DefaultRegistry.cs.
Your other example is that a Product must exist to be linked to a Project. This sounds to me like a Domain Service would be the best option to facilitate the cooperation between 2 bounded contexts. The Domain Service will ensure the invariants remain consistent across the bounded contexts. That Domain Service would not be exposed publicly, so you would need an ApplicationService or a CQRS-type interface which takes that DomainService as a dependency, allowing the DomainService to perform the operations required. The DomainService should contain the domain behavior, whereas the ApplicationService should just be a facilitator that calls that function. Your DomainService would then throw exceptions rather than leave invariants inconsistent or invalid.
You should ultimately end up in a position where you don't have duplicated validation, or at least you never end up with invalid invariants because validation has not been performed at some point, as validation is always handled within your domain model.
While a descriptive error message may seem to pertain to presentation more than to business, the descriptive error message actually embodies a business rule contained within the domain model -- and when throwing an exception of any kind, it is best practice to pass along some descriptive message. This message can be re-thrown up the layers to ultimately display to the user.
Now, when it comes to preemptive validation (such as a widget allowing the user to only type certain characters or select from a certain range of options) an entity might contain some constants or methods that return a dynamically-produced regular expression which may be utilized by a view model and in turn implemented by the widget.

Static? Repositories MVC3, EF4.2 (Code First)

I'm new to MVC, EF and the like so I followed the MVC3 tutorial at http://www.asp.net/mvc and set up an application (not yet finished with everything though).
Here's the "architecture" of my application so far
GenericRepository
PropertyRepository inherits GenericRepository for "Property" Entity
HomeController which has the PropertyRepository as Member.
Example:
public class HomeController
{
private readonly PropertyRepository _propertyRepository
= new PropertyRepository(new ConfigurationDbContext());
}
Now let's consider the following:
I have a method in my GenericRepository that takes quite some time, invoking 6 queries which need to run in one transaction in order to maintain integrity. My Google results indicated that SaveChanges() is treated as one transaction, so if I make multiple changes to my context and then call SaveChanges(), I can be "sure" that these changes are "atomic" on the SQL Server. Right? Wrong?
Furthermore, there is an action method that calls the _propertyRepository.InvokeLongAndComplex() method.
I just found out that MVC creates a new controller for each request. So I end up with multiple PropertyRepositories, which mess up my database integrity. (I have to maintain a linked list of my properties in the database, and if a user moves a property it takes 6 steps to change the list accordingly, but that way I avoid looping through all entities when there are thousands...)
I thought about making my GenericRepository and my PropertyRepository static, so that every HomeController uses the same repository, and synchronizing the InvokeLongAndComplex method to make sure only one thread makes changes to the DB at a time.
I suspect this idea is not a good solution, but I have failed to find a suitable one. Some people say it's okay to have static repositories (but what happens with the context?). Others say to use IoC/DI (?), which sounds like a lot of work to set up (and I'm not even sure it solves my problem...), but it seems I could "tell" the container to always "inject" the same context object and the same repository, and then it would be enough to synchronize the InvokeLongAndComplex() method so that multiple threads don't mess up the integrity.
Why aren't data repositories static?
Answer 2:
2) You often want to have 1 repository instance per-request to make it easier to ensure that uncommited changes from one user don't mess things up for another user.
Why have a repository instance per request? Doesn't it mess up my linked list again?
Can anyone give me some advice or share a best practice which I can follow?
No! You must have a new context for each request, so even if you make your repositories static, you will have to pass the current context instance to each of their methods instead of maintaining a single context inside the repository.
What do you mean by integrity in the first place? Are you dealing with transactions, concurrency issues or referential constraints? Handling all of these issues is your responsibility. EF will provide some basic infrastructure for that, but the final solution is still up to your implementation.

How to validate in domain layer

I often see people validating domain objects by creating rule objects which take in a delegate to perform the validation, such as in this example: http://www.codeproject.com/KB/cs/DelegateBusinessObjects.aspx
What I don't understand is how this is advantageous compared to, say, just writing a method.
For example, in that particular article there is a method which creates delegates to check if the string is empty.
But is that not the same as simply having something like:
bool validate()
{
    return !string.IsNullOrEmpty(name);
}
Why go through the trouble of making an object to hold the rule and defining the rule in a delegate when these rules are context-sensitive and will likely not be shared? The exact same thing can be achieved with methods.
There are several reasons:
SRP - Single Responsibility Principle. An object should not be responsible for its own validation, it has its own responsibility and reasons to exist.
Additionally, when it comes to complex business rules, having them explicitly stated makes validation code easier to write and understand.
Business rules also tend to change quite a lot, more so than other domain objects, so separating them out helps with isolating the changes.
The example you have posted is too simple to benefit from a fully fledged validation object, but it is very handy once systems get large and validation rules become complex.
The obvious example here is a webapp: You fill in a form and click "submit". Some of your data is wrong. What happens?
Something throws an exception. Something (probably higher up) catches the exception and prints it (maybe you only catch UserInputInvalidExceptions, on the assumption that other exceptions should just be logged). You see the first thing that was wrong.
You write a validate() function. It says "no". What do you display to the user?
You write a validate() function which returns (or throws an exception with, or appends to) a list of messages. You display the messages... but wouldn't it be nice to group by field? Or to display it beside the field that was wrong? Do you use a list of tuple or a tuple of lists? How many lines do you want a rule to take up?
Encapsulating rules into an object lets you easily iterate over the rules and return the rules that were broken. You don't have to write boilerplate append-message-to-list code for every rule. You can stick broken rules next to the field that broke them.
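A small sketch of what that iteration could look like follows. Rule<T>, RuleChecker, BrokenRulesByField and SignupForm are assumed names chosen for the example, not types from the linked article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule object: a field name, a message, and a delegate that tests the target.
public sealed class Rule<T>
{
    private readonly Func<T, bool> _isSatisfied;

    public Rule(string field, string message, Func<T, bool> isSatisfied)
    {
        Field = field;
        Message = message;
        _isSatisfied = isSatisfied;
    }

    public string Field { get; private set; }
    public string Message { get; private set; }

    public bool IsBrokenBy(T target) { return !_isSatisfied(target); }
}

public static class RuleChecker
{
    // Returns the messages of all broken rules, grouped by the field they belong to.
    public static IDictionary<string, List<string>> BrokenRulesByField<T>(
        T target, IEnumerable<Rule<T>> rules)
    {
        return rules.Where(r => r.IsBrokenBy(target))
                    .GroupBy(r => r.Field)
                    .ToDictionary(g => g.Key, g => g.Select(r => r.Message).ToList());
    }
}

public sealed class SignupForm
{
    public string Name { get; set; }
    public string Email { get; set; }
}
```

Each rule is declared in one line, and the grouping by field comes for free, so a view can look up the messages for "Email" and render them next to that input.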
