How to ensure data integrity when the domain model changes - validation

I'm working on a project where I applied DDD principles.
In order to ensure domain integrity I validate each domain model (entities or value objects) on creation.
Example of the user entity:
class User {
    constructor(opts) {
        this.email = opts.email;
        this.password = opts.password;
        this.validate();
    }
    validate() {
        if (typeof this.email !== 'string') {
            throw new Error('email is invalid');
        }
        if (typeof this.password !== 'string') {
            throw new Error('password is invalid');
        }
    }
}
The validate method is a naive implementation of validation (I know I should verify the email with a regex and handle the errors more effectively).
This model is then persisted using the userRepository module.
Now, imagine I want to add a new property username to my user model, my validate method will look like this:
validate() {
    if (typeof this.email !== 'string') {
        throw new Error('email is invalid');
    }
    if (typeof this.password !== 'string') {
        throw new Error('password is invalid');
    }
    if (typeof this.username !== 'string') {
        throw new Error('username is invalid');
    }
}
The problem is that old user models already stored will not have the username property, which is now required. Therefore, when I fetch data from the database and try to construct the model, it will throw an error.
To fix this problem I see multiple solutions (but none seems good to me):
Create an anti-corruption layer in the user repository (create a default username if none is defined)
Relax the invariant in my domain model (username is not required)
Use cron services that update database entities based on the domain change (again, set a default username)

The problem is that old user models stored will not have the username property which is now required.
Yup, that's a problem.
Here's how I think about it -- the persisted copy of your domain model is a message, sent by an instance of your domain model running in the past to an instance of your domain model running in the future.
If you want those messages to be compatible, then you need to accept certain constraints in the design of your message schema.
One of those constraints is that you don't add new required fields to an existing message type.
Adding optional fields is fine, because systems that don't care can ignore the optional fields, and the systems that do care can provide a default value when the field is missing.
But if you need to add a new required field, then you create a new message.
The event sourcing community has to worry about this sort of thing a lot (events are messages); Greg Young wrote Versioning in an Event Sourced System, which has good lessons on the versioning of messages.
To fix this problem I see multiple solutions (but none seems good to me)
I agree, these are all kind of lousy - in the sense that they are all introducing a mechanism for deriving a "default" user name where none exists. That being the case, the field is effectively optional; so why claim that it is required?
In a situation where the field isn't required, but you want to stop accepting new data that doesn't include this field -- you probably want to put new validation on the data input code path. That is to say, you can create a new API with messages that require the field, validate those messages, and then use the domain model with the optional field to store and fetch the data.
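A minimal sketch of that split, assuming a hypothetical parseRegisterUserV2 function on the input path (the function name and its error message are illustrative, not from the question). The domain model treats username as optional so that old stored records still load, while the new API message requires it:

```javascript
// Domain model: username is optional, so records persisted before the
// field existed can still be reconstructed.
class User {
  constructor(opts) {
    this.email = opts.email;
    this.password = opts.password;
    // Old records have no username; represent that explicitly as null.
    this.username = opts.username === undefined ? null : opts.username;
    this.validate();
  }

  validate() {
    if (typeof this.email !== 'string') {
      throw new Error('email is invalid');
    }
    if (typeof this.password !== 'string') {
      throw new Error('password is invalid');
    }
    if (this.username !== null && typeof this.username !== 'string') {
      throw new Error('username is invalid');
    }
  }
}

// Hypothetical validator for the new message type on the input path:
// here, and only here, username is required.
function parseRegisterUserV2(body) {
  if (typeof body.username !== 'string' || body.username.length === 0) {
    throw new Error('username is required');
  }
  return new User(body);
}
```

The repository keeps constructing User directly from stored data, so nothing that was valid yesterday becomes unreadable today.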
So adding a new required field is an anti-pattern in DDD
Adding new required fields is an anti-pattern in messaging; DDD has little to do with it.
You shouldn't be expecting to be able to add required fields to existing schema in a backwards compatible way. Instead, you extend the message schema by introducing a new message in which the field is required.
I thought applying DDD principles help to handle the business logic complexity and also help to design evoluting software and evoluting domain models
It does, but it isn't magic. If your new models aren't backward compatible with the old models, then you are going to have to manage that change in some way:
You might declare bankruptcy, and simply forget all previous history.
You might migrate your existing data to the new data model.
You might maintain the two different data models in parallel.
In other words, backwards compatibility is a long term concern that you should be thinking about as you design your solution.
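One way to keep two data models in parallel is to tag each persisted record with a schema version and pick the reader by that tag. The sketch below assumes a hypothetical schemaVersion field and hypothetical reader functions; nothing in the question says the repository stores such a tag.

```javascript
// Readers keyed by the version tag stored with each record.
const readers = {
  // Version 1 records were written before username existed.
  1: (record) => ({
    email: record.email,
    password: record.password,
    username: null,
  }),
  // Version 2 is a new message type in which username is required.
  2: (record) => {
    if (typeof record.username !== 'string') {
      throw new Error('v2 record is missing username');
    }
    return {
      email: record.email,
      password: record.password,
      username: record.username,
    };
  },
};

function readUserRecord(record) {
  // Records written before versioning was introduced are treated as version 1.
  const version = record.schemaVersion === undefined ? 1 : record.schemaVersion;
  const reader = readers[version];
  if (!reader) {
    throw new Error('unknown schema version: ' + version);
  }
  return reader(record);
}
```

Old messages stay readable under their original contract, and the stricter rule applies only to records that were written under the new one.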


EF Core 2.1 trying to include primary key field in INSERT query when adding to DbContext and saving

In an ASP .Net Core 2.1 Web API (with a MySQL database and using Pomelo), when I add a new entity to the database in one of my controller actions, if the entity that is received by the API from the consuming client has a value in the primary key, it appears as though EF Core is trying to add the primary key instead of allowing the database to give it a new value.
So... in the database, I have a table called person which has an integer field called id which is set to PRIMARY KEY and AUTO-INCREMENT.
Model:
public partial class Person
{
public int? Id { get; set; }
public string Name { get; set; }
public string Surname { get; set; }
}
DbContext:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Person>(entity =>
    {
        entity.ToTable("person");
        entity.HasKey(e => e.Id);
        entity.Property(e => e.Id)
            .HasColumnName("id")
            .HasColumnType("int(11)");
        entity.Property(e => e.Name)
            .HasColumnName("name")
            .HasColumnType("varchar(45)");
        entity.Property(e => e.Surname)
            .HasColumnName("surname")
            .HasColumnType("varchar(45)");
    });
}
Controller Action
// POST: api/Person
[HttpPost]
public async Task<IActionResult> AddPerson([FromBody]Person person)
{
if (!ModelState.IsValid)
return BadRequest(ModelState);
_context.Person.Add(person);
await _context.SaveChangesAsync();
return CreatedAtAction("GetPerson", new { id = person.Id }, person);
}
If I don't specifically clear the Id of the person before trying to insert it into the database (i.e. person.Id = null) then I get an exception complaining about duplicate primary key. Is this normal EF Core behavior? Or am I doing something wrong?
Frankly, yes, you are doing something wrong. For a whole host of reasons, you should never ever save an instance created from user input (i.e. the Person instance being passed into your action and created from the request body of the post) directly to your database. One such reason is that it causes havoc with ORMs like EF, which employ entity tracking to optimize queries.
Simply, this Person instance here is untracked - EF knows nothing about it. You then use Add to add it to your context, which signals EF to start tracking it as a new thing. When you later save, EF, then dutifully issues an insert statement, but since an id is included in that insert, you get a primary key conflict. What you wanted instead was for EF to do an update, but it doesn't know it should.
There are ways you can technically fix this. For example, you could use Attach rather than Add. That merely tells EF that this is something it should track, without necessarily communicating that it should do anything with it. If you make any modifications to this instance after it is tracked, EF will change its state to "modified" and you'll end up with an update statement being issued when you save. However, if you're not making any changes, but just saving it directly, you'll also need to explicitly set its state to "modified" or EF will essentially do nothing. The nice thing is that if you change the state on an untracked entity, EF automatically attaches it in order to track that state, so you don't need to call Attach manually. Long and short, you can clear the exception merely by replacing your Add line with:
_context.Entry(person).State = EntityState.Modified;
However, that will then cause a problem if you try to add an entirely new person. A bigger issue is that you have one action doing double duty. In REST, POST is not idempotent: replaying the same request is not guaranteed to give the same result, which is why it is the verb used for creating resources. Put more simply, you POST only to a resource like /api/person (rather than something like /api/person/1), and every time you do so a new person should be created. For an update, you should make a request to that actual resource, i.e. /api/person/1, and the HTTP verb should be PUT instead. The same PUT request to the same resource will always have the same result, which is the case for an update to a particular resource.
Theory aside, the simple point is that you should have two actions:
[HttpPost("")]
public async Task<IActionResult> AddPerson([FromBody]Person person)
[HttpPut("{id}")]
public async Task<IActionResult> UpdatePerson(int id, [FromBody]Person person)
Finally, even with all this, saving the person param directly puts too much trust in the user, when doing an update. There might be any number of properties an end-user should not be able to modify with an update (such as something like a "created" date, for example), but they can when you do this. In some ways worse, even if the user is not being malicious, you're still relying on them to post all the data for that entity. For example, if you did have a created date property, but the user doesn't post that with their update (honestly, why would you post a created date along with a request to update a resource), then it will have the effect of clearing that property out. If there's a default, it will be set back to that, and if not, you may actually get an exception on saving, if the column is NOT NULL.
Long and short, it's not a good idea. Instead, use a view model, DTO, or similar. That class should contain only properties you want to allow a user to modify or even to affect on create in the first place. Then, for the case of an update, you pull the resource fresh from the database, and map over the values from your param instance onto that. Finally, you save the version from the database back to the database. This ensures 1) the user cannot modify anything you do not explicitly allow, 2) the user only needs to post things they actually care about modifying, and 3) the entity will be properly tracked and EF will issue an update statement correctly on save.
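The load, map, save sequence is not specific to EF. Here is a rough, language-neutral sketch in JavaScript, with an in-memory Map standing in for the database and a hypothetical UPDATABLE_FIELDS whitelist (both are assumptions for illustration, not part of the question's code):

```javascript
// Stand-in for the person table; a real application would go through its ORM.
const people = new Map([
  [1, { id: 1, name: 'Ann', surname: 'Lee', created: '2018-01-01' }],
]);

// The only properties a client is allowed to change on update.
const UPDATABLE_FIELDS = ['name', 'surname'];

function updatePerson(id, dto) {
  // 1. Load the current state from the store.
  const existing = people.get(id);
  if (!existing) {
    return null; // the caller would translate this into a 404
  }
  // 2. Copy over only the allowed fields that the client actually sent.
  for (const field of UPDATABLE_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(dto, field)) {
      existing[field] = dto[field];
    }
  }
  // 3. Save the loaded entity, never the raw input.
  people.set(id, existing);
  return existing;
}
```

Because only whitelisted fields are copied, a posted id or created value is ignored, and fields the client omits keep their stored values.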

Web application's form validation - design to propagate domain errors to client-side?

Data validation should occur at the following places in a web-application:
Client-side: browser. To speed up user error reporting
Server-side: controller. To check if user input is syntactically valid (no sql injections, for example, valid format for all passed in fields, all required fields are filled in etc.)
Server-side: model (domain layer). To check if user input is domain-wise valid (no duplicating usernames, account balance is not negative etc.)
I am currently a DDD fan, so I have UI and Domain layers separated in my applications.
I am also trying to follow the rule, that domain model should never contain an invalid data.
So, how do you design validation mechanism in your application so that validation errors, that take place in the domain, propagate properly to the client? For example, when domain model raises an exception about duplicate username, how to correctly bind that exception to the submitted form?
Some article, that inspired this question, can be found here: http://verraes.net/2015/02/form-command-model-validation/
I've seen no such mechanisms in the web frameworks known to me. What first springs to mind is to have the domain model include the name of the field that caused the exception in the exception data, and then, in the UI layer, provide a map between form data fields and model data fields to show the error in its proper context for the user. Is this approach valid? It looks shaky... Are there some examples of better design?
Although not exactly the same question as this one, I think the answer is the same:
Encapsulate the validation logic into a reusable class. These classes are usually called specifications, validators or rules and are part of the domain.
Now you can use these specifications in both the model and the service layer.
If your UI uses the same technology as the model, you may also be able to use the specifications there (e.g. when using NodeJS on the server, you're able to write the specs in JS and use them in the browser, too).
Edit - additional information after the chat
Create fine-grained specifications, so that you are able to display appropriate error messages if a spec fails.
Don't make business rules or specifications aware of form fields.
Only create specs for business rules, not for basic input validation tasks (e.g. checking for null).
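A small sketch of how fine-grained specifications might look, using the two business rules mentioned in the question (unique username, non-negative balance). The object shape, the check and toFormErrors functions, and the fieldForRule map are all hypothetical; the point is only that the rule-to-field mapping lives in the UI layer, not in the domain:

```javascript
// Each specification checks one business rule and knows nothing about forms.
const specifications = [
  {
    name: 'usernameIsUnique',
    isSatisfiedBy: (candidate, ctx) =>
      !ctx.takenUsernames.includes(candidate.username),
    message: 'This username is already taken.',
  },
  {
    name: 'balanceIsNotNegative',
    isSatisfiedBy: (candidate) => candidate.balance >= 0,
    message: 'Account balance must not be negative.',
  },
];

// Domain side: report the name and message of every rule that failed.
function check(candidate, ctx) {
  return specifications
    .filter((spec) => !spec.isSatisfiedBy(candidate, ctx))
    .map((spec) => ({ rule: spec.name, message: spec.message }));
}

// UI side: the mapping from rule to form field is kept outside the domain.
const fieldForRule = {
  usernameIsUnique: 'username',
  balanceIsNotNegative: 'balance',
};

function toFormErrors(failures) {
  const errors = {};
  for (const failure of failures) {
    errors[fieldForRule[failure.rule]] = failure.message;
  }
  return errors;
}
```

Since each rule fails independently, the form can show a specific message next to each offending field.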
I want to share the approach used by us in one DDD project.
We created a BaseClass having fields ErrorId & ErrorMessage.
Every DomainModel derives from this BaseClass & thus has two extra fields, ErrorId & ErrorMessage, available from BaseClass.
Whenever an exception occurs, we handle it (log it on the server, take appropriate steps for compensating logic, and fetch a user-friendly message from a resource file localized for the client's location) and then propagate the data through the normal flow without raising or throwing an exception.
On the client side, check whether ErrorMessage is not null, and if so show the error.
It's a basic, simple approach that we followed from the start of the project.
For a new project this is the least complicated and an efficient approach, but if you are making changes in a big old project it might not help, as the changes required are large.
For validation at each field level, use Validation Application Block from Enterprise Library.
It can be used as follows:
Decorate domain model properties with proper attributes like:
public class AttributeCustomer
{
[NotNullValidator(MessageTemplate = "Customer must have valid no")]
[StringLengthValidator(5, RangeBoundaryType.Inclusive,
5, RangeBoundaryType.Inclusive,
MessageTemplate = "Customer no must have {3} characters.")]
[RegexValidator("[A-Z]{2}[0-9]{3}",
MessageTemplate = "Customer no must be 2 capital letters and 3 numbers.")]
public string CustomerNo { get; set; }
}
Create validator instance like:
Validator<AttributeCustomer> cusValidator =
valFactory.CreateValidator<AttributeCustomer>();
Use object & do validation as :
customer.CustomerNo = "AB123";
customer.FirstName = "Brown";
customer.LastName = "Green";
customer.BirthDate = "1980-01-01";
customer.CustomerType = "VIP";
ValidationResults valResults = cusValidator.Validate(customer);
Check Validation results as:
if (valResults.IsValid)
{
MessageBox.Show("Customer information is valid");
}
else
{
foreach (ValidationResult item in valResults)
{
// Put your validation detection logic
}
}
Code example is taken from Microsoft Enterprise Library 5.0 - Introduction to Validation Block
These links will help you understand the Validation Application Block:
http://www.codeproject.com/Articles/256355/Microsoft-Enterprise-Library-Introduction-to-V
https://msdn.microsoft.com/en-in/library/ff650131.aspx
https://msdn.microsoft.com/library/cc467894.aspx

How do I do cross-entity server-side validation

In my application, I have cross-entity validation logic that requires me to look at the entire change set and I'm doing this using the BeforeSaveEntities override.
I can construct the right logic by examining the saveMap parameter, but what am I supposed to do if I find something invalid?
If I throw an exception, like I would for single entity validation in the BeforeSaveEntity override, the whole save is aborted and the error is reported to the client. But some of the entities might be valid so I would want to save those and only abort the invalid parts.
Because BeforeSaveEntities returns a saveMap, I think I should be able to remove the invalid entities from the change set and continue to save the valid entities, but then how do I report the invalid parts to the client?
Is it possible to do a partial save of only the valid entities and at the same time, report a sensible error to the client to describe the parts of the save that failed?
Jay told you the way it is.
I wouldn't hold my breath waiting for Breeze to change because I think yours is a rare scenario and it isn't one we would want to encourage anyway.
But I'm weird and I can't stop thinking what I'd do if were you and I absolutely HAD to do it. I might try something like this.
Warning: this is pseudo-code and I'm making this up. I do not recommend or warrant this.
Create a custom MyCustomEFContextProvider that derives from EFContextProvider.
Give it an ErrorEntities property to hold the error object
Override (shadow) the SaveChanges method with another that delegates to the base
public new CustomSaveResult SaveChanges(JObject saveBundle,
TransactionSettings transactionSettings = null) {
var result = base.SaveChanges(saveBundle, transactionSettings);
// learn about CustomSaveResult below
return new CustomSaveResult(this.ErrorEntities, result);
}
Catch an invalid entity inside BeforeSaveEntities
Pass it with error message to your custom ErrorEntities property
You get to that property via the EntityInfo instance as in
((MyCustomEFContextProvider) info.ContextProvider).ErrorEntities.Add(new ErrorEntity(info, message));
Remove the invalid entity from the SaveMap so it won't be included in the actual save
Let the save continue
The second line of your override SaveChanges method creates a new instance of your CustomSaveResult from the standard one and returns that to the caller.
public class CustomSaveResult : SaveResult {
    public List<ErrorEntity> ErrorEntities;
    public CustomSaveResult(List<ErrorEntity> errorEntities, SaveResult result) {
        // copy over everything from the standard result
        this.Entities = result.Entities;
        this.KeyMappings = result.KeyMappings;
        this.Errors = result.Errors;
        // and now your error stuff
        this.ErrorEntities = errorEntities;
    }
}
Let's assume the caller is your Web API controller's SaveChanges method. Well you don't have to change a thing but you might make it clear by explicitly returning your custom SaveResult:
readonly MyCustomEFContextProvider _contextProvider = new MyCustomEFContextProvider();
...
[HttpPost]
public CustomSaveResult SaveChanges(JObject saveBundle) {
return _contextProvider.SaveChanges(saveBundle);
}
JSON.Net will happily serialize the usual material + your custom ErrorEntities property (be sure to make it serializable!) and send it to the Breeze client.
On the Breeze client you write your own variation on the stock Breeze Web API data service adapter. Yours does almost exactly the same thing as the Breeze version. But, when processing the save payload from the server, it also extracts this extra "error entities" material in the response and does whatever you want to do with it.
I don't know what that will be but now you have it.
See how easy that was? LOL.
Breeze does not currently support a save mechanism that both saves and returns an error at the same time. While possible this seems a bit baroque.
As you pointed out, you can
1) Throw an exception inside of the BeforeSaveEntities and fail the save. You can even specify which specific entity or entities caused the failure and why. In this case the entire save is aborted.
or
2) Remove 'bad' items from the saveMap within the BeforeSaveEntities and save only a subset of what was passed in. In this case you are performing a partial save.
But we don't support a hybrid of these two. Please add this to the Breeze User Voice if you feel strongly and we can see if other members of the community feel that this would be useful.
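For what it's worth, the shape of such a hybrid (save the valid entities, report the rest) can be sketched independently of Breeze. In the JavaScript below, a plain array stands in for the saveMap, and partialSave, validate and persist are hypothetical names; none of this is Breeze's API.

```javascript
// Split a change set into entities to save and entities to report.
// `validate` returns an error message for an invalid entity, or null.
// `persist` saves the given entities and returns what was saved.
function partialSave(changeSet, validate, persist) {
  const errorEntities = [];
  const toSave = [];
  for (const entity of changeSet) {
    // The whole change set is passed in so cross-entity rules can be checked.
    const message = validate(entity, changeSet);
    if (message) {
      errorEntities.push({ entity, message });
    } else {
      toSave.push(entity);
    }
  }
  const entities = persist(toSave);
  // The result carries both the saved entities and the rejected ones.
  return { entities, errorEntities };
}
```

The client then has to understand the extra errorEntities part of the response, which is exactly the custom adapter work described in the other answer.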

should a validation function access the repository directly?

I have the following in my application:
Action Orm entity (From telerik open access)
Repository(Of Action)
AppService(Holds an instance of the repository)
when I need to save an instance, I send the instance to the AppService. the AppService then calls a validator to validate the instance to save. the validator is based on http://codeinsanity.com/archive/2008/12/02/a-framework-for-validation-and-business-rules.aspx
(full code on https://github.com/riteshrao/ncommon)
so basically my save function in the AppService looks like this
Public Sub AddAction(ByVal Item As Data.Model.Action)
Contract.Requires(Of ArgumentNullException)(Item IsNot Nothing, "Item is nothing.")
Dim validateResult As Rules.ValidationResult = _ActionValidator.Validate(Item)
If Not validateResult.IsValid Then
Throw New Validation.ValidationException(validateResult)
End If
Try
_ActionRepository.Add(Item)
_unitOfWork.SaveChanges()
Catch ex As Exception
_unitOfWork.ClearChanges()
Throw New DataServiceException(ex.Message, ex)
End Try
End Sub
For checking properties of the Action item, the sample code works great. My question begins when I need to make sure that the same action is not added twice to the DB for the same customer (i.e. the id is different, but the name and the customer are the same).
as I see it I have a few options:
option 1: check for a duplicate action using something like
Function(validatedItem) validatedItem.Customer.Actions.Any(Function(item) item.id <> validatedItem.id AndAlso item.name = validatedItem.name)
basically I go from the action being saved back to the customer and then back to all his actions and check if any action exists with a different id and same name
the down sides are:
a. for this to work, when accessing the customer property of the item, the entire customer object is read from DB which is redundant in this case
b. the Any function is being evaluated on the client as item.Customer.Actions returns IList(Of Action)
Option 2: let the validation class have access to the action repository. then i could simply do something like
'assume I already have validatedItem
repository.Any(function(item) item.id<>validatedItem.id and item.customerid=validatedItem.customerid and item.name=validatedItem.name)
this will result in an Exists query being sent to the DB but the downside(?) is that the validation framework should not access the repository directly (as far as I have seen in the very few examples i could find that do use validation and ORM)
Option 3: let the validation class have access to the AppService and use the AppService to check for existence of a duplicate.
problems:
a. I create a circular reference (AppService->Validation Class->AppService)
b. I need to create a lot of useless functions in the AppService for loading items based on criteria that is only relevant for the validation
Any ideas what is the best course here?
The simplest is not to check duplicates in the database from your domain.
When a collection of entities is part of your aggregate, it is a non-issue, since you would not permit the duplicate to be added to the collection. Since the aggregate is stored as a whole, you would never run into the problem.
For scenarios where you do not want a duplicate, say, e-mail address and no collection of the entities is represented by an aggregate (such as the Users in a system) you can just let the database enforce the uniqueness. Simply pick up the exception and report back. In many instances your validation would not be able to enforce the uniqueness simply because it doesn't have/implement the required locks that a database system would have.
So I'd simply leave that up to the database.
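A sketch of "pick up the exception and report back", with a small in-memory class standing in for a table that has a unique constraint on e-mail. UniqueViolation, UserTable and registerUser are hypothetical names; with a real database you would detect the driver's own unique-constraint error instead.

```javascript
// Stand-in for the error a database raises on a unique-constraint violation.
class UniqueViolation extends Error {}

// Stand-in for a table with a UNIQUE constraint on the email column.
class UserTable {
  constructor() {
    this.emails = new Set();
  }

  insert(user) {
    if (this.emails.has(user.email)) {
      throw new UniqueViolation('duplicate email: ' + user.email);
    }
    this.emails.add(user.email);
  }
}

function registerUser(table, user) {
  try {
    table.insert(user);
    return { ok: true };
  } catch (err) {
    if (err instanceof UniqueViolation) {
      // Translate the storage error into something the caller can show.
      return { ok: false, error: 'That e-mail address is already registered.' };
    }
    throw err; // anything else is a genuine failure
  }
}
```

There is no check-then-insert step here, so there is no window in which two concurrent requests could both pass a validation query.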

Spring JSON merge

I'm updating a record from a form over AJAX. I have a JSON object that maps to my entity, and my controller method is:
@RequestMapping(value = "/vendors", method = RequestMethod.POST)
public ResponseEntity<String> saveVendorsJson(@RequestParam String vendor) {
    Vendor v = Vendor.fromJsonToVendor(vendor);
    if (v.merge() == null) {
        v.persist();
    }
    return new ResponseEntity<String>(HttpStatus.OK);
}
I expected from the documentation that v.merge() will return null if it didn't find an existing record by the object's 'id' field to merge with, and in that case I want to persist it as a new Vendor object.
What's happening is, despite my JSON having an 'id' field value that matches an existing record, I'm ALWAYS inserting a new record with my updated goods from the browser.
I'm aware I'm having the POST method pull double-duty here, which isn't strictly RESTful. In theory, this is simpler for me (though of course that's turning out not to be the case).
I believe this is a Hibernate thing. Hibernate will not "merge" if it doesn't know it already exists. I think what I have done in the past is to do a lookup, then a persist. I think if you try to just merge something coming in off the wire you would get a Primary Key Collision, or something similar. I believe Hibernate has some sort of "dirty" flag internal to indicate if you are creating or editing an existing object.
There also used to be a way in raw Hibernate to do a soft lookup, basically telling Hibernate "look, I have this object, I don't want to do a SELECT blah-blah-blah, I just want to update some fields". It would load the object into the cache and allow you to do the update without doing the SELECT first. There is also a saveOrUpdate() available through Spring, but that actually does the SELECT first.
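The "lookup, then persist" flow can be sketched without any ORM. The JavaScript below uses an in-memory Map as a stand-in for the vendor table, and saveVendor is a hypothetical helper, not Hibernate's or Spring's API:

```javascript
// Stand-in for the vendor table and its auto-generated ids.
const vendors = new Map();
let nextId = 1;

function saveVendor(incoming) {
  // 1. Look up an existing record by id, if the client sent one.
  const existing = incoming.id == null ? undefined : vendors.get(incoming.id);
  if (existing) {
    // 2a. Found: copy the incoming values onto the stored record.
    Object.assign(existing, incoming);
    return { action: 'updated', vendor: existing };
  }
  // 2b. Not found: store a new record under a freshly generated id.
  const created = Object.assign({}, incoming, { id: nextId++ });
  vendors.set(created.id, created);
  return { action: 'created', vendor: created };
}
```

Making the lookup explicit means the create-or-update decision no longer depends on what the persistence layer infers about an object that arrived over the wire.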
