DDD - Life cycle of Value Objects: Validation & Persistence - validation

I understand what a VO is (immutable, no identity, ...), but I have several questions that came up in discussions with my co-workers.
A) Validation - How precise should it be?
What types of validation should I put into a VO? Only basic ones, or all of them? A common example of a VO is Email with some regexp validation; I've seen it many times. But I've worked on several medium-size and big applications where a regexp wasn't good enough:
system-A: The domain name of the email was validated, e.g. test#gmali.com is an invalid email because the domain gmali.com doesn't exist
system-B: We had a list (service) of banned domains of "temporary email services" because we wanted to avoid "fake accounts"
I cannot imagine putting validation of this kind into a VO, because it requires network communication and would make the VO complicated (and slow).
B) Validation: Names, Titles, Strings... is length part of VO?
A programmer can use the good old string data type. I can imagine a VO such as NotEmptyString, but is it a good approach to create value objects such as:
FirstName (non-empty string with length limitation)
Surname (non-empty string with length limitation)
StreetName(non-empty string with length limitation)
There is no real difference between FirstName and Surname, because the application cannot find out whether someone swapped first name and surname in a form. Robert can be a first name, and it can also be a surname...
class Person
{
    private string $firstName; // no VO for firstName
    // or
    private FirstName $firstName; // VO just for firstName & length validation
    // or
    private NotEmptyString $firstName; // VO - no max length validation
    // or
    private StringLength50 $firstName; // same as FirstName, different name for sharing
}
Which approach is the best and why?
C) Copy of VO: Providing "Type-Safety" for entity arguments?
This point is similar to previous one.
Is it good practice to create classes like this:
class Surname extends Name
{
}
class FirstName extends Name
{
}
just "to alias" VO?
D) Persistence: Reading stored VO
This point is closely related to the first one: A) Validation - How precise should it be?. I strongly believe that what is stored in my "storage engine" (DB) is valid - no questions. I don't see any reason why I should validate a VO again when everything was validated during the "persistence step". Even a complex or poorly written regexp could be a performance killer, e.g. when listing several hundred user emails.
I'm lost here... should I validate only basic stuff and use same VO during persist and read or should I have 2 separate VO for these cases?
E) Persistence/Admin: Something like "god" in the system.
From my experience: in a real-world system, a user with higher privileges can sometimes bypass validation rules, and this is again related to point A). Example:
you (as regular user of system) can make online reservation up to 30 days from today
admin user can make online reservation to any date
Should I use only Date / FutureDate VO or what?
F) Persistence: Mapping to DB data-types
Is it good practice to tightly couple VO and DB (storage engine) data types?
If FirstName can have only 50 chars, should it be defined / mapped to VARCHAR(50)?
Thanks.

A) Validation - How precise should it be?
It's not about precision, it's about invariants & responsibility. A value object (VO) can't possibly have authority on whether or not an email address exists. That's a fact that varies and can't be controlled by the VO. Even if you had code such as the following:
var emailAddress = EmailAddress.of('some#email.com', emailValidityChecker);
The address may not exist a few minutes later, the user may have lost his account password forever, etc.
So what should EmailAddress represent? It should ensure the "format" of the address makes it a usable & useful address in your domain.
For instance, in a system responsible for delivering tax reminders, I had a limitation where I had to use Exchange and it couldn't support certain email formats like addresses with "leading, trailing or consecutive dots in the local-part" (took the exact comment I had put).
Even though that's a technical concern in theory, that means our system couldn't ingest such email addresses and they were completely useless to us so the ValidEmailAddress VO did not accept those to fail early (otherwise it was generating false positives down the chain).
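A format-only VO along those lines might look like the following minimal Python sketch (Python is used here only because the thread is language agnostic). The regular expressions are illustrative assumptions, deliberately much simpler than RFC 5322; they only encode the "no leading, trailing or consecutive dots in the local-part" rule mentioned above.

```python
import re
from dataclasses import dataclass

# Assumed, simplified format rule: the local part is a sequence of
# dot-separated atoms, which rules out leading, trailing and consecutive dots.
_LOCAL = r"[A-Za-z0-9_+-]+(?:\.[A-Za-z0-9_+-]+)*"
_DOMAIN = r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
_EMAIL_RE = re.compile(rf"{_LOCAL}@{_DOMAIN}")


@dataclass(frozen=True)
class ValidEmailAddress:
    """Immutable value object: only checks that the format is usable."""

    value: str

    def __post_init__(self) -> None:
        # Fail early; no network access, no existence check.
        if not _EMAIL_RE.fullmatch(self.value):
            raise ValueError(f"unusable email address: {self.value!r}")
```

Whether the mailbox actually exists is deliberately out of scope; that fact can change at any time and is not something the VO can guarantee.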
B) Validation: Names, Titles, Strings... is length part of VO?
I would, even though such lengths might sometimes feel somewhat arbitrary or infrastructure-driven. However, I think it's safe to say that a name with 500 characters is certainly a mistake. Furthermore, validating with reasonable ranges can protect against attacks (e.g. a 1GB name???). Some may argue that it's purely an infrastructure concern and would put the validation at another layer, but I disagree and I think the distinction is unhelpful.
The length rules aren't always arbitrary, for instance a TweetMessage that can't be longer than 280 chars, that's a domain rule.
Does that mean you must have a VO for every possible string in the system? Honestly, I have pushed back in the past, afraid of overusing VOs and edging towards VO obsession rather than primitive obsession, but in almost every such scenario I later wished I had just taken the time to wrap that damn string.
Be pragmatic, but I see more harm in underusing than overusing VOs.
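As a rough illustration of wrapping strings with length rules, here is a minimal Python sketch. The 50-character limit for FirstName is an assumption borrowed from the question's example; the 280-character limit for TweetMessage is the domain rule mentioned above.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class FirstName:
    MAX_LENGTH: ClassVar[int] = 50  # assumed limit, from the question's example
    value: str

    def __post_init__(self) -> None:
        if not self.value.strip():
            raise ValueError("first name must not be empty")
        if len(self.value) > self.MAX_LENGTH:
            raise ValueError(f"first name must be at most {self.MAX_LENGTH} characters")


@dataclass(frozen=True)
class TweetMessage:
    MAX_LENGTH: ClassVar[int] = 280  # domain rule, not an infrastructure limit
    value: str

    def __post_init__(self) -> None:
        if len(self.value) > self.MAX_LENGTH:
            raise ValueError(f"a tweet must be at most {self.MAX_LENGTH} characters")
```

Both classes are tiny, which is the point: the cost of wrapping a string is low compared with chasing an unvalidated primitive through the system.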
C) Copy of VO: Providing "Type-Safety" for entity arguments?
I most likely wouldn't extend Name just for the sake of reuse here. There's most likely no place where you'd want to interchange a Surname with a FirstName, so polymorphism is pretty useless too. However, explicit types may help prevent accidentally passing a surname where a first name is expected, and vice versa.
Independently of whether or not the explicit types are useful, something more useful here might be to aggregate both under a FullName VO, which increases cohesion.
Please beware that overly restrictive name policies have been a huge pain point for many international systems though...
D) Persistence: Reading stored VO
Persisted data lives on the "safe" side and should NOT be validated again when loaded into memory. You should be able to bypass the validation path when hydrating your VOs.
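One possible way to provide such a bypass is a separate factory reserved for the persistence layer. The sketch below is an assumption about how that could look in Python; the name `from_storage` and the simplified regex are hypothetical.

```python
import re


class EmailAddress:
    # Simplified, illustrative format check (not a full email grammar).
    _FORMAT = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
    __slots__ = ("_value",)

    def __init__(self, value: str) -> None:
        # Normal path: user input is validated on construction.
        if not self._FORMAT.fullmatch(value):
            raise ValueError(f"invalid email address: {value!r}")
        self._value = value

    @classmethod
    def from_storage(cls, value: str) -> "EmailAddress":
        """Hydration path: trust persisted data and skip validation."""
        instance = cls.__new__(cls)  # does not call __init__
        instance._value = value
        return instance

    @property
    def value(self) -> str:
        return self._value
```

A repository would call `EmailAddress.from_storage(row_value)`, while application code keeps using the validating constructor. This also lets old rows survive a later tightening of the format rule.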
E) Persistence/Admin: Something like "god" in the system.
VOs are great to enforce their "invariants". An invariant by definition doesn't vary given the context. That's actually something many misunderstood when saying "always-valid" approach doesn't work.
That said, even system admins most likely can't make new reservations in the past, so perhaps that can be an invariant of a ReservationDate. Ultimately you would most likely extract the other rules in the context to which they belong.
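The split between an invariant and a context-dependent rule could be sketched like this in Python. `ensure_bookable` and the `is_admin` flag are hypothetical names; the 30-day window comes from the question.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ReservationDate:
    value: date

    @classmethod
    def create(cls, value: date, today: date) -> "ReservationDate":
        # Invariant: holds for every user, admins included.
        if value < today:
            raise ValueError("a reservation cannot be made in the past")
        return cls(value)


def ensure_bookable(reservation: ReservationDate, today: date, is_admin: bool) -> None:
    """Context rule (hypothetical policy): regular users may book at most 30 days ahead."""
    if not is_admin and reservation.value > today + timedelta(days=30):
        raise PermissionError("regular users can book up to 30 days ahead")
```

`today` is passed in rather than read from the clock so the rule stays deterministic and easy to test.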
F) Persistence: Mapping to DB data-types
I think it's more important to reflect the DB limitation in the domain than inversely, reflect the domain limitation in the DB. If your DB only accepts 50 chars and you exceed that some systems will just crash with a very cryptic error message not even telling you which column overflowed. Validating in the domain would help debugging much more quickly. However, you do not necessarily have to match the domain rule in the DB.

DDD, like any other design, is a matter of drawing lines and making abstract rules. Some rules may be very strict, while others may be fluent to some extent. The important thing is to keep consistency as much as possible, rather than striving to build the ultimate-undefeatable domain.
Validation - How precise should it be?
"Heavy" validations should not occur inside a VO. A VO is not very different in nature from the primitive it encapsulates, therefore its validations should be independent of external factors. Recall that even primitives such as byte are validated internally: an exception (sometimes even a compile error) occurs when a byte variable is assigned a value greater than 255.
Advanced validations, on the other hand, belong to the flow part (use-case / interactor / command-handler), since they involve operations beyond the scope of the VO's primitive, such as querying databases or invoking APIs. You can, for example, query a list of valid email providers from database, check if VO's provider contained in list, and throw exception if not. This is simply flow.
You may, of course, decide to have an in-memory static list of email providers, in which case it will be perfectly valid to keep it inside VO and check its primitive against that list. No need to communicate with external world, everything is "local". Would it be scalable? probably not. But it follows a DDD rule stating that VO should not "speak" with external resources.
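The division of labour described above could be sketched as follows in Python. `register_user`, `load_banned_domains` and `BannedEmailDomain` are hypothetical names; the loader stands in for whatever database query or API call supplies the list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        # Local check only: exactly one "@" with text on both sides.
        local, sep, domain = self.value.partition("@")
        if not (local and sep and domain) or "@" in domain:
            raise ValueError(f"malformed email address: {self.value!r}")

    @property
    def domain(self) -> str:
        return self.value.partition("@")[2].lower()


class BannedEmailDomain(Exception):
    pass


def register_user(
    raw_email: str, load_banned_domains: Callable[[], Iterable[str]]
) -> EmailAddress:
    """Use-case flow: the VO checks format, the flow checks external facts."""
    email = EmailAddress(raw_email)
    if email.domain in set(load_banned_domains()):
        raise BannedEmailDomain(email.domain)
    return email
```

The VO never talks to the outside world; the use case decides where the banned list comes from.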
Validation: Names, Titles, Strings... is length part of VO?
VOs, much like other DDD concepts, should "speak out loud" your business domain, meaning that they should express business semantics. This is why FirstName, Surname and StreetName are good names, while NotEmptyString is less preferable due to the fact it communicates technical rather than business details.
If, for example, your business states that customers with a more-than-50-characters-length name are to be addressed differently than customers with a less-than-50-characters-length name, then you probably should have two VOs, e.g. LongFirstName, ShortFirstName.
True, several VOs may require exactly the same validations, e.g. both StreetName and CityName must start with a capital and length cannot exceed 100. Does this mean we have to make great effort to avoid duplications in the name of "reusability"? I would say no, especially if avoiding duplications means having a single VO named CapitalHeadStringUpTo100Characters. Such name conveys no business purpose. Moreover, if CityName suddenly requires additional validations, breaking CapitalHeadStringUpTo100Characters into two VOs may require much work.
Copy of VO: Providing "Type-Safety" for entity arguments?
Inheritance is a tool provided by development platform, it is more than OK to use it, but only to the point where things get messy or too abstract. Remember, VO only expresses a specific domain-approach principle. The polymorphism OOP principle, on the other hand, which of course may be applied in DDD applications, is tightly coupled with abstraction concepts (i.e. base classes), and I would say it should fit better to the entities model part.
BTW, you can find on web several implementations for a base VO class.
Persistence: Reading stored VO
System design would matter little if the same validations had to run over and over again at different points of a single use case. Unless you have a reason to believe that your database can be altered by external components, it is sufficient to reconstitute an entity from the database without re-validating. Also keep in mind that a typical entity embeds at least one VO, and it is the same VO type in both the "persistence step" (when the entity is constructed) and the "reading step" (when it is reconstituted).
Persistence/Admin: Something like "god" in the system.
Multitenancy applications can be aware of multiple kinds of users. Software does not care if one user is more powerful than another one, it is only subjected to rules. Whether you choose to have a single general entity User with VO FutureDate allowed to be set with null, or two entities User, Admin with VOs FutureDate (not null), FutureDate (nullable) respectively, is less of our interest here. The important thing is that multitenancy can be achieved through smart usage of dependency injection: system identifies user privileges and infers what factories, services or validations are to be injected.
Persistence: Mapping to DB data-types
It really depends on level of maturity in the DDD field. Applications will always have bugs, and you should have some clue on your business's bug-tolerance level in case you choose to design a lenient database.
Aside from that, keep in mind that no matter how much effort you put into it, your database can probably never reflect the full set of business invariants: limiting a single VO to some length is easy, but setting rules involving multiple VOs (that is, when one VO's validity depends on another VO) is less convenient.

Related

Is there such a thing as "too many value objects" in a class? (I'm implementing DDD)

Note: I'm using PHP, but I think this question would be considered language agnostic. I'm also implementing "lite DDD", but I don't think that restricts the question any either.
I have a class (entity) with a lot of properties that can be set (I also have a lot of behavior on the entity, so it's not an anemic domain model).
For example, I have many string properties that look like the following:
public function setFirstName($firstName)
{
    // Removes all non-printable characters and extra spaces.
    // Also converts all "look-a-like" quotes to the standard quote character.
    $firstName = Helper::cleanName($firstName);
    if (empty($firstName)) {
        throw new \Exception("first name is required");
    }
    $this->firstName = $firstName;
}
Adhering to DRY, I see benefit in creating a "clean name" value object. But this will turn almost all of my string properties into these value objects. I also have other properties that are dates that I would turn into Carbon date value objects. In the end, it seems almost every property (that will be persisted) has become a value object. I'm concerned that I'm over-using this approach, or that there are memory-related problems that could arise (especially in larger collections of these entities).
How does one go about determining if a class is over-using value objects? I.e., is there a minimum amount of logic that must be encapsulated for something to be a value object? Am I blowing up memory by making almost every property (that will be persisted) a value object? Has anyone encountered issues with a class with perhaps 30+ value objects?
According to Mathias Verraes' excellent presentation Unbreakable Domain Models, he mentions Value Objects are the "heart and soul" of programming. So I think the ultimate answer is to not be shy when it comes to using Value Objects.
However, after thinking about the type of validation I mention in my question (cleaning characters out of these names, etc.) I have decided I can shift this to a form input handler class. It feels a little better to do some cleaning at the "application level" before getting to the domain. Of course, if I start to have any validation that requires domain logic then I should be doing that at the domain level always. And I think the Carbon dates should stay. I also have a property called "urlSafeName" (or slug) which I'm considering making a Value Object (takes a string and turns it into a more "pretty url" form).
I'm still a bit worried about VO's that hold arrays and which could be substituted for very small tables in a database. For example, a VO for a job title which comes from an array of job titles (President, CTO, Lead Developer, etc.). I'd have to be sure the array is going to stay at a manageable size, that size is of course going to be subjective though. If I have an address VO with a cityId, stateId, and countryId, then I probably want to just store them as Ids and not store all the names of those cities, states, and countries in the VO (but should be in a lookup table in the database).

Modelling calculated fields in Rest API

It is a common practice for Restful resources to support field selectors in the query string. For example, if a resource has fields A,B,C and D but the client is interested only in a subset of fields (say A and B) then the Url might look like
.../resource/1/?fields=A,B // only A and B are 'selected'
Now suppose we add another property to the resource. The thing with this property is that it does not have any physical storage. It is a computed value. Also suppose that this computation is very expensive.
Now obviously, Rest does not care about such things, whether the data comes from a file, a DB or a fancy algorithm.
But here comes a dilemma: the 'fields' query parameter is always optional. In my case, omitting 'fields' means "bring all the fields" (much like '*' in SQL):
.../resource/1 // A,B,C,D and E(xpensive) are 'selected'
I am positive that there are many existing clients that are using the naive approach (not bothering to specify an explicit list of fields). This means that adding this new heavy property will unintentionally create a performance break (possibly a very severe one).
What are the common techniques to cope with these situations?
Alternatives I considered:
1. Add a special notion to the system that says that querying with '*' semantics will not necessarily return ALL the fields (heavy fields will be omitted by default). If a client wants them, he must ask for them explicitly.
2. Not to model these extra properties as fields on the resource. Instead expose a dedicated endpoint that will carry out the computation, thus eliminating possible confusion but introducing Rest-RPC style into the system.
3. Make it the client's problem: if he did not bother to be explicit in the first place, tough for him. That is not really an option; I don't have this privilege.
I like option 1 best. I'd consider representing these fields as referenced object, and then you could get to them HATEOAS style, through separate calls for the heavy fields. This is similar to how web-pages behave -- return the framework and some content, then force extra calls if the user wants the images, videos, etc.
Take a look at this: spring.io/understanding/HATEOAS, this: timelessrepo.com/haters-gonna-hateoas, and this: stackoverflow.com/questions/tagged/hateoas
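Option 1 (heavy fields omitted unless requested explicitly) could be sketched server-side roughly as follows, in Python. `render_resource` and the split into `cheap` and `expensive` provider maps are assumptions made for illustration, not part of any framework.

```python
from typing import Callable, Dict, Iterable, Optional

FieldProviders = Dict[str, Callable[[], object]]


def render_resource(
    cheap: FieldProviders,
    expensive: FieldProviders,
    fields: Optional[Iterable[str]] = None,
) -> Dict[str, object]:
    """Build the response body; `fields` is the parsed ?fields= selector."""
    providers = {**cheap, **expensive}
    # No explicit selection: default to the cheap fields only.
    names = list(cheap) if fields is None else list(fields)
    unknown = [name for name in names if name not in providers]
    if unknown:
        raise KeyError(f"unknown fields: {unknown}")
    # Providers are only invoked for the selected fields.
    return {name: providers[name]() for name in names}
```

Because each field is a callable, the expensive computation runs only when a client names that field, so existing clients that omit `fields` keep their current performance.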

How do I avoid duplicating validation logic between the domain and application layers?

Any given entity in my domain model has several invariants that need be enforced -- a project's name must be at least 5 characters, a certain product must exist to be associated with the project, the due date must not be prior to the current date and time, etc.
Obviously I want the client to be able to display error messages related to validation, but I don't want to constantly maintain the validation rules between several different layers of the program -- for example, in the widget, the controller, the application service or command object, and the domain. Plus, it would seem that a descriptive error message is presentation-related and not belonging to the domain layer. How can I solve these dilemmas?
I would create specific exceptions related to your expected error conditions. This is standard for Exception handling in general and will help with your issue. For example:
public class ProjectNameNotLongEnoughException : System.Exception
or
public class DueDatePriorToCurrentDateException : System.Exception
Mark these possible exceptions in the xml comments for the methods that may throw them so that applications written against your domain model will know to watch out for these exceptions and will be able to present a message within the presentation of the application. This also allows you to have localized error messages based on the culture without cluttering up your domain model with presentation concerns.
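To illustrate how the presentation layer might turn such exceptions into localized messages, here is a small Python sketch (the answer's own examples are C#). The `MESSAGES` catalogue, `message_for` and the culture codes are hypothetical.

```python
from typing import Dict, Type


class DomainError(Exception):
    """Base class for expected domain rule violations."""


class ProjectNameNotLongEnough(DomainError):
    pass


class DueDatePriorToCurrentDate(DomainError):
    pass


# Presentation-layer catalogue (hypothetical): exception type -> text per culture.
MESSAGES: Dict[str, Dict[Type[DomainError], str]] = {
    "en": {
        ProjectNameNotLongEnough: "The project name is too short.",
        DueDatePriorToCurrentDate: "The due date cannot be in the past.",
    },
    "fr": {
        ProjectNameNotLongEnough: "Le nom du projet est trop court.",
        DueDatePriorToCurrentDate: "La date d'échéance ne peut pas être dans le passé.",
    },
}


def message_for(error: DomainError, culture: str = "en") -> str:
    """Look up the user-facing text; the domain only knows the exception type."""
    catalogue = MESSAGES.get(culture, MESSAGES["en"])
    return catalogue.get(type(error), "The request could not be processed.")
```

The domain raises typed exceptions and stays free of wording; only the catalogue changes when a new language is added.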
If you choose to perform client-side validation, I'm afraid that you can't have your cake and eat it too. In this case, you may have to duplicate validation logic in order to achieve the desired features while maintaining your architecture.
Hope this helps!
I realise this is an old question, but this may help others in a similar situation.
You have here Behavior and Conditions which you need to encapsulate into your domain model.
For example, the ProjectName having a requirement on a certain length I would suggest should be encapsulated within a ValueObject. It may seem overboard for some, but within our Domain Model we almost always encapsulate native types, especially String, within a ValueObject. This then allows you to roll your validation within the constructor of the ValueObject.
Within the Constructor you can throw an Exception relating to the violation of the parameters passed in. Here is an example of one of our ValueObjects for a ZoneName:
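public ZoneName(string name)
{
    if (String.IsNullOrWhiteSpace(name))
    {
        // ArgumentNullException(paramName, message): a single string argument would be read as the parameter name.
        throw new ArgumentNullException(nameof(name), "Zone Name is required");
    }
    if (name.Length > 33)
    {
        throw new ArgumentException("Zone name should be at most 33 characters long", nameof(name));
    }
    Name = name;
}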
Now consumers of that ValueObject can either perform their own validation before calling the constructor, or not, but either way your invariants will be consistent with your model design.
One way to build validation rules within your Domain Model and then use them within your UI is the MediatR library, which uses a One Model In, One Model Out pattern and allows you to define Validators for each of your Query or Command models. These are defined using FluentValidation. You can then add a Provider to the ModelValidatorProviders within MVC. Take a look at Jimmy Bogard's ContosoUniversity example here https://github.com/jbogard/ContosoUniversity/tree/master/src/ContosoUniversity and look at the DependancyResolution folder, DefaultRegistry.cs.
Your other example is that a Product must exist to be linked to a Project. This sounds to me like a Domain Service would be the best option to facilitate the cooperation between the 2 bounded contexts. The Domain Service will ensure the invariants remain consistent across the bounded contexts. That Domain Service would not be exposed publicly, so you would need an ApplicationService or a CQRS-type interface which takes that DomainService as a dependency, allowing the DomainService to perform the operations required. The DomainService should contain the Domain Behavior, whereas the Application Service should just be a facilitator that calls that function. Your DomainService would then throw exceptions rather than leave the model in an inconsistent or invalid state.
You should ultimately end up in a position where you don't have duplicated validation, or at least you never end up with invalid invariants because validation has not been performed at some point, as validation is always handled within your domain model.
While a descriptive error message may seem to pertain to presentation more so than business, the descriptive error message actually embodies a business rule contained within the domain model -- and when throwing an exception of any kind, it is best practice to pass along some descriptive message. This message can be re-thrown up the layers to ultimately display to the user.
Now, when it comes to preemptive validation (such as a widget allowing the user to only type certain characters or select from a certain range of options) an entity might contain some constants or methods that return a dynamically-produced regular expression which may be utilized by a view model and in turn implemented by the widget.

Validation on domain entities along with MVP

How do you apply validation in an MVP/domain environment ?
Let me clarify with an example:
Domain entity:
class Customer
{
string Name;
etc.
}
MVP-model
class CustomerModel
{
string Name;
etc.
}
I want to apply validation on my domain entities, but the MVP model has its own model/class apart from the domain entity. Does that mean I have to copy the validation code to also work on the MVP-model?
One solution I came up with is to drop the MVP-model and use the domain entity as the MVP-model, but I don't want to set data on the entities that isn't validated yet.
A second problem that arises is that if the entity has notify-events, other parts of the application will be affected by faulty data.
A third thing with that approach: if the user edits some data and then cancels the edit, how do I revert to the old values? (The entity might not come from a DB, so reloading the entity isn't possible in all cases.)
Another solution is to make some sort of copy/clone of the entity in question and use the copy as MVP-model, but then it might get troublesome if the entity has a large object graph.
Anyone has some tips about these problems?
Constraining something like the name of a person probably does not rightfully belong in the domain model, unless in the client's company there is actually a rule that they don't do business with customers whose names exceed 96 characters.
String length and the like are not concerns of the domain -- two different applications employing the same model could have different requirements, depending on the UI, persistence constraints, and use cases.
On the one hand, you want to be sure that your model of a person is complete and accurate, but consider the "real world" person you are modeling. There are no rules about length and no logical corollary to "oops, there was a problem trying to give this person a name." A person just has a name, so I'd argue that it is the responsibility of the presenter to validate what the user enters before populating the domain model, because the format of the data is a concern of the application more so than the domain.
Furthermore, as Udi Dahan explains in his article, Employing the Domain Model Pattern, we use the domain model pattern to encapsulate rules that are subject to change. That a person should not have a null name is not a requirement that is likely ever to change.
I might consider using Debug.Assert() in the domain entity just for an added layer of protection through integration and/or manual testing, if I was really concerned about a null name sneaking in, but something like length, again, doesn't belong there.
Don't use your domain entities directly -- keep that presentation layer; you're going to need it. You laid out three very real problems with using entities directly (I think Udi Dahan's article touches on this as well).
Your domain model should not acquiesce to the needs of the application, and soon enough your UI is going to need an event or collection filter that you're just going to have to stick into that entity. Let the presentation layer serve as the adapter instead and each layer will be able to maintain its integrity.
Let me be clear that the domain model does not have to be devoid of validation, but the validation that it contains should be domain-specific. For example, when attempting to give someone a pay raise, there may be a requirement that no raise can be awarded within 6 months of the last one so you'd need to validate the effective date of the raise. This is a business rule, is subject to change, and absolutely belongs in the domain model.

Fat Domain Models => Inefficient?

Looking at DDD, we abstract the database into the various models which we operate on and look at it as a repository where our models live. Then we add the Data Layers and the Service/Business layers on top of it. My question is, in doing so, are we creating inefficiencies in data transfer by building fat models?
For example, say we have system that displays an invoice for a customer on the screen.
Thinking of it in terms of OOP, we'd probably end up with an object that looks somewhat like this:
class Invoice {
Customer _customer;
OrderItems _orderitems;
ShippingInfo _shippingInfo;
}
class Customer {
string name;
int customerID;
Address customerAddress;
AccountingInfo accountingInfo;
ShoppingHistory customerHistory;
}
(for the sake of the question/argument, let's say it was determined that the customer class had to implement AccountingInfo and ShoppingHistory)
If the invoice solely needs to print the customer name, why would we carry all the other baggage with it? Using the repository type of approach seems like we would be building these complex domain objects which require all these resources (CPU, memory, complex query joins, etc) AND then transferring it over the tubes to the client.
Simply adding a customerName property to the invoice class would be breaking away from abstractions and seems like a horrible practice. On the third hand, half filling an object like the Customer seems like a very bad idea as you could end up creating multiple versions of the same object (e.g. one that has a an Address, but no ShoppingHistory, and one that has AccountingInfo but no Address, etc, etc). What am I missing, or not understanding?
Good object-relational mappers can lazy load relationships, so you would pull back the customer for your invoice but ignore their accounting and shopping history. You could roll your own if you're not using an object-relational mapper.
Often you can't do this within your client as you'll have crossed your transaction boundary (ended your database transaction), so it is up to your service layer to ensure the right data has been loaded.
Testing the right data is available (and not too much of it) is often good to do in unit tests on a service layer.
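A hand-rolled version of lazy loading, for the case where no object-relational mapper is used, might look like this Python sketch. `Lazy` and the `load_history` callback are hypothetical; a real loader would run a query inside the service layer's transaction.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Defers a load until the value is first requested, then caches it."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._loaded = False
        self._value: Optional[T] = None

    def get(self) -> T:
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value  # type: ignore[return-value]


class Customer:
    def __init__(self, name: str, load_history: Callable[[], List[str]]) -> None:
        self.name = name
        self._history = Lazy(load_history)

    @property
    def shopping_history(self) -> List[str]:
        # Only hits the data source the first time it is accessed.
        return self._history.get()
```

Printing an invoice that reads only `customer.name` never triggers the history query, which is the behaviour a lazy-loading mapper gives you out of the box.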
You say "it was determined that the customer class had to implement AccountingInfo and ShoppingHistory", so clearly displaying an invoice is NOT the only task that the system performs (how else was it "determined" that customers need those other functionalities otherwise?-).
So you need a table of customers anyway (for those other functionalities) -- and of course your invoice printer needs to get customer data (even just the name) from that one table, the same one that's used by other functionality in the system.
So the "overhead" is purely illusory -- it appears to exist when you look at one functionality in isolation, but doesn't exist at all when you look at the whole system as an integrated whole.

Resources