Is there such a thing as "too many value objects" in a class? (I'm implementing DDD)

Note: I'm using PHP, but I think this question can be considered language agnostic. I'm also implementing "lite DDD", but I don't think that restricts the question either.
I have a class (entity) with a lot of properties that can be set (I also have a lot of behavior on the entity, so it's not an anemic domain model).
For example, I have many string properties that look like the following:
public function setFirstName($firstName)
{
    // Removes all non-printable characters and extra spaces.
    // Also converts all "look-a-like" quotes to the standard quote character.
    $firstName = Helper::cleanName($firstName);
    if (empty($firstName)) {
        throw new \Exception("first name is required");
    }
    $this->firstName = $firstName;
}
Adhering to DRY, I see benefit in creating a "clean name" value object. But this will turn almost all of my string properties into these value objects. I also have other properties that are dates that I would turn into Carbon date value objects. In the end, it seems almost every property (that will be persisted) has become a value object. I'm concerned that I'm overusing this approach, or that memory-related problems could arise (especially in larger collections of these entities).
How does one go about determining if a class is overusing value objects? I.e., is there a minimum amount of logic that has to be encapsulated for something to be a value object? Am I blowing up memory by making almost every property (that will be persisted) a value object? Has anyone encountered issues with a class with perhaps 30+ value objects?
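For illustration, here is roughly what such a "clean name" value object could look like. This is a minimal sketch in JavaScript (the language used for the hydration examples further down this page) rather than PHP. The CleanName class and its exact cleaning rules are assumptions standing in for Helper::cleanName, whose implementation isn't shown in the question.

```javascript
// Hypothetical value object. The cleaning rules only approximate the comments
// in setFirstName() above; they are not the real Helper::cleanName.
class CleanName {
  #value;

  constructor(raw) {
    const cleaned = String(raw)
      .replace(/[\u2018\u2019\u201B\u2032]/g, "'")    // look-alike single quotes
      .replace(/[\u201C\u201D]/g, '"')                // look-alike double quotes
      .replace(/[\u0000-\u0008\u000E-\u001F\u007F]/g, "") // non-printable, non-whitespace
      .replace(/\s+/g, " ")                           // collapse extra whitespace
      .trim();
    if (cleaned === "") {
      throw new Error("name is required");
    }
    this.#value = cleaned;
  }

  get value() {
    return this.#value;
  }

  // Value objects compare by value, not by identity.
  equals(other) {
    return other instanceof CleanName && other.value === this.#value;
  }

  toString() {
    return this.#value;
  }
}
```

With this in place, a setter such as setFirstName() would shrink to a single assignment of an already-valid CleanName, and the same class could back every name-like property.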

In his excellent presentation Unbreakable Domain Models, Mathias Verraes calls Value Objects the "heart and soul" of programming. So I think the ultimate answer is to not be shy when it comes to using Value Objects.
However, after thinking about the type of validation I mention in my question (cleaning characters out of these names, etc.) I have decided I can shift this to a form input handler class. It feels a little better to do some cleaning at the "application level" before getting to the domain. Of course, if I start to have any validation that requires domain logic then I should be doing that at the domain level always. And I think the Carbon dates should stay. I also have a property called "urlSafeName" (or slug) which I'm considering making a Value Object (takes a string and turns it into a more "pretty url" form).
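As a rough idea of what the "urlSafeName" value object could look like, here is a minimal JavaScript sketch. The Slug class name and the exact slug rules (lowercase ASCII letters and digits separated by hyphens) are assumptions, not something taken from my actual code.

```javascript
// Hypothetical slug value object: takes any string and keeps a "pretty URL" form.
class Slug {
  #value;

  constructor(text) {
    const slug = String(text)
      .normalize("NFKD")                 // split accented letters into base + accent
      .replace(/[\u0300-\u036f]/g, "")   // drop the combining accents
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")       // any other run of characters becomes one hyphen
      .replace(/^-+|-+$/g, "");          // no leading or trailing hyphens
    if (slug === "") {
      throw new Error("text does not contain any URL-safe characters");
    }
    this.#value = slug;
  }

  get value() {
    return this.#value;
  }
}
```

The transformation is pure and has no external dependencies, which is what makes it a comfortable fit for a value object.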
I'm still a bit worried about VOs that hold arrays and that could be substituted for very small tables in a database. For example, a VO for a job title which comes from an array of job titles (President, CTO, Lead Developer, etc.). I'd have to be sure the array is going to stay at a manageable size, though what counts as manageable is of course subjective. If I have an address VO with a cityId, stateId, and countryId, then I probably want to just store them as IDs and not store all the names of those cities, states, and countries in the VO (they should be in a lookup table in the database instead).

Related

DDD - Life cycle of Value Objects: Validation & Persistence

I understand what is VO (immutable, no-identity, ...). But I have several questions that are from discussion with my co-workers.
A) Validation - How precise should it be?
What types of validation should I put into a VO? Only basic ones, or all of them? A good example of a VO is Email with some regexp validation. I've seen it many times. I've worked on several big/medium-size applications where a regexp wasn't good enough because:
system-A: The domain name of the email was validated, e.g. test@gmali.com is an invalid email because the domain gmali.com doesn't exist
system-B: We had a list (service) of banned domains of "temporary email services" because we wanted to avoid "fake accounts"
I cannot imagine putting validation of this kind into a VO, because it requires network communication and the VO would become complicated (and slow).
B) Validation: Names, Titles, Strings... is length part of VO?
A programmer can use the good old string data type. I can imagine a VO such as NotEmptyString, but is it a good approach to create value objects such as:
FirstName (non-empty string with length limitation)
Surname (non-empty string with length limitation)
StreetName(non-empty string with length limitation)
There is no difference between FirstName and Surname, because the application cannot find out whether someone swapped first name and surname in a form. Robert can be a first name and it can also be a surname...
class Person
{
    private string $firstName; // no VO for firstName
    // or
    private FirstName $firstName; // VO just for firstName & length validation
    // or
    private NotEmptyString $firstName; // VO - no max length validation
    // or
    private StringLength50 $firstName; // same as FirstName, different name for sharing
}
Which approach is the best and why?
C) Copy of VO: Providing "Type-Safety" for entity arguments?
This point is similar to previous one.
Is it good practice to create classes like this:
class Surname extends Name
{
}
class FirstName extends Name
{
}
just "to alias" VO?
D) Persistence: Reading stored VO
This point is closely related to the first one: A) Validation - How precise should it be?. I strongly believe that what is stored in my "storage engine" (DB) is valid - no questions. I don't see any reason why I should validate a VO again when everything was validated during the "persistence step". Even a complex or poorly written regexp could be a performance killer when listing hundreds of user emails.
I'm lost here... should I validate only basic stuff and use the same VO during persist and read, or should I have 2 separate VOs for these cases?
E) Persistence/Admin: Something like "god" in the system.
From my experience: in a real-world system, a user with higher privileges can sometimes bypass validation rules, and this is again related to point A). Example:
you (as regular user of system) can make online reservation up to 30 days from today
admin user can make online reservation to any date
Should I use only Date / FutureDate VO or what?
F) Persistence: Mapping to DB data-types
Is it good practice to closely bind VO and DB (storage engine) data types?
If FirstName can have only 50 chars, should it be defined / mapped to VARCHAR(50)?
Thanks.
A) Validation - How precise should it be?
It's not about precision, it's about invariants & responsibility. A value object (VO) can't possibly have authority on whether or not an email address exists. That's a fact that varies and can't be controlled by the VO. Even if you had code such as the following:
var emailAddress = EmailAddress.of('some@email.com', emailValidityChecker);
The address may not exist a few minutes later, the user may have lost his account password forever, etc.
So what should EmailAddress represent? It should ensure the "format" of the address makes it a usable & useful address in your domain.
For instance, in a system responsible for delivering tax reminders, I had a limitation where I had to use Exchange and it couldn't support certain email formats like addresses with "leading, trailing or consecutive dots in the local-part" (took the exact comment I had put).
Even though that's a technical concern in theory, that means our system couldn't ingest such email addresses and they were completely useless to us so the ValidEmailAddress VO did not accept those to fail early (otherwise it was generating false positives down the chain).
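To make that concrete, here is a minimal JavaScript sketch of a format-only email VO. It only knows rules that can be checked locally, including the "leading, trailing or consecutive dots in the local-part" restriction mentioned above. The class shape and error messages are assumptions for illustration, not the code from that system.

```javascript
// Hypothetical format-only email VO: no network calls, no existence checks.
class ValidEmailAddress {
  #value;

  constructor(raw) {
    const value = String(raw).trim();
    const at = value.lastIndexOf("@");
    if (at <= 0 || at === value.length - 1) {
      throw new Error("email address needs a local part and a domain");
    }
    const localPart = value.slice(0, at);
    // Domain-specific usability rule: the downstream mail system cannot ingest these.
    if (localPart.startsWith(".") || localPart.endsWith(".") || localPart.includes("..")) {
      throw new Error("local part has leading, trailing or consecutive dots");
    }
    this.#value = value;
  }

  get value() {
    return this.#value;
  }
}
```

Whether the mailbox actually exists stays outside the VO, because that fact can change at any time.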
B) Validation: Names, Titles, Strings... is length part of VO?
I would, even though such lengths might sometimes feel somewhat arbitrary or infrastructure-driven. However, I think it's safe to say that a name with 500 characters is certainly a mistake. Furthermore, validating with reasonable ranges can protect against attacks (e.g. a 1GB name???). Some may argue that it's purely an infrastructure concern and would put the validation at another layer, but I disagree and I think the distinction is unhelpful.
The length rules aren't always arbitrary, for instance a TweetMessage that can't be longer than 280 chars, that's a domain rule.
Does that mean you must have a VO for every possible string in the system? Honestly, I used to push back, scared of overusing VOs and edging towards VO obsession rather than primitive obsession, but in almost every scenario I ended up wishing I had just taken the time to wrap that damn string.
Be pragmatic, but I see more harm in underusing than overusing VOs.
C) Copy of VO: Providing "Type-Safety" for entity arguments?
I most likely wouldn't extend Name just for the sake of reuse here. There's most likely no place where you'd want to interchange a Surname with a FirstName, so polymorphism is pretty useless too. However, the explicit types may help prevent accidentally passing a surname where a first name is expected, and vice versa.
Independently of whether or not the explicit types are useful, something more useful here might be to aggregate both under a FullName VO, which increases cohesion.
Please beware that overly restrictive name policies have been a huge pain point for many international systems, though...
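A minimal JavaScript sketch of that idea follows. FirstName, Surname and FullName are separate classes without a shared base, and FullName checks the types it is given. The only rule assumed here is "non-empty", deliberately permissive given the warning about restrictive name policies; all names in the sketch are illustrative.

```javascript
// Hypothetical helper shared by both VOs (composition rather than inheritance).
function requireNonEmpty(value, label) {
  const text = String(value).trim();
  if (text === "") throw new Error(label + " is required");
  return text;
}

class FirstName {
  constructor(value) {
    this.value = requireNonEmpty(value, "first name");
    Object.freeze(this);
  }
}

class Surname {
  constructor(value) {
    this.value = requireNonEmpty(value, "surname");
    Object.freeze(this);
  }
}

class FullName {
  constructor(firstName, surname) {
    // The explicit types catch swapped arguments early.
    if (!(firstName instanceof FirstName)) throw new TypeError("expected a FirstName");
    if (!(surname instanceof Surname)) throw new TypeError("expected a Surname");
    this.firstName = firstName;
    this.surname = surname;
    Object.freeze(this);
  }

  // Display order is itself culture-dependent; this is just one assumed convention.
  toString() {
    return this.firstName.value + " " + this.surname.value;
  }
}
```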
D) Persistence: Reading stored VO
Persisted data lives on the "safe" side and should NOT be validated again when loaded into memory. You should be able to circumvent the validation path when hydrating your VOs.
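One way to get two paths into the same VO is a pair of named factories: one that validates new input and one that trusts stored data. The JavaScript sketch below assumes a StreetName VO with a 100-character limit; the names of(), fromStorage() and the limit itself are illustrative assumptions.

```javascript
// Hypothetical VO with a validating factory and a trusting factory.
class StreetName {
  #value;

  // JavaScript has no private constructors, so treat this one as internal by convention.
  constructor(value) {
    this.#value = value;
  }

  // Path for new data coming from users or other systems: full validation.
  static of(raw) {
    const value = String(raw).trim();
    if (value === "" || value.length > 100) {
      throw new Error("street name must be 1 to 100 characters long");
    }
    return new StreetName(value);
  }

  // Path for hydration: the value was validated before it was persisted.
  static fromStorage(stored) {
    return new StreetName(stored);
  }

  get value() {
    return this.#value;
  }
}
```

Repositories and mappers would call fromStorage(), everything else would call of().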
E) Persistence/Admin: Something like "god" in the system.
VOs are great for enforcing their "invariants". An invariant by definition doesn't vary with the context. That's actually something many people misunderstand when they say the "always-valid" approach doesn't work.
That said, even system admins most likely can't make new reservations in the past, so perhaps that can be an invariant of a ReservationDate. Ultimately you would most likely extract the other rules in the context to which they belong.
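Here is a minimal JavaScript sketch of that split: the VO holds the one rule that is true for everyone, and the 30-day limit from the question lives in a separate, context-specific check. Passing the current time in as a parameter and the { isAdmin } user shape are assumptions made to keep the example self-contained.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical VO: its only invariant holds for every user, admins included.
class ReservationDate {
  constructor(date, now) {
    if (date.getTime() < now.getTime()) {
      throw new Error("a reservation cannot be made in the past");
    }
    this.timestamp = date.getTime(); // store a number so the VO stays immutable
    Object.freeze(this);
  }
}

// Context-specific rule kept outside the VO.
function assertCanBook(user, reservationDate, now) {
  const limit = now.getTime() + 30 * DAY_MS;
  if (!user.isAdmin && reservationDate.timestamp > limit) {
    throw new Error("regular users can book at most 30 days ahead");
  }
}
```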
F) Persistence: Mapping to DB data-types
I think it's more important to reflect the DB limitation in the domain than inversely, reflect the domain limitation in the DB. If your DB only accepts 50 chars and you exceed that some systems will just crash with a very cryptic error message not even telling you which column overflowed. Validating in the domain would help debugging much more quickly. However, you do not necessarily have to match the domain rule in the DB.
DDD, like any other design approach, is a matter of drawing lines and making abstract rules. Some rules may be very strict, while others may be fluid to some extent. The important thing is to stay as consistent as possible, rather than striving to build the ultimate, undefeatable domain.
Validation - How precise should it be?
"Heavy" validations should not occur inside a VO. A VO is not very different in its nature from the primitive it encapsulates, therefore validations should be independent of external factors. Please recall that even primitives such as byte may be internally validated: an exception (sometimes even a compile error) occurs when a byte variable is assigned a value greater than 255.
Advanced validations, on the other hand, belong to the flow part (use-case / interactor / command-handler), since they involve operations beyond the scope of the VO's primitive, such as querying databases or invoking APIs. You can, for example, query a list of valid email providers from the database, check whether the VO's provider is contained in the list, and throw an exception if not. This is simply flow.
You may, of course, decide to have an in-memory static list of email providers, in which case it will be perfectly valid to keep it inside VO and check its primitive against that list. No need to communicate with external world, everything is "local". Would it be scalable? probably not. But it follows a DDD rule stating that VO should not "speak" with external resources.
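A minimal JavaScript sketch of the "flow" variant: the local format check stays in a small parsing function, while the banned-domain lookup is done by a handler that receives its data source from outside. RegisterUserHandler, parseEmail and the isBanned() method are hypothetical names, and the lookup is synchronous only to keep the example short.

```javascript
// Local, self-contained check: the part that could live inside a VO.
function parseEmail(raw) {
  const value = String(raw).trim();
  const at = value.lastIndexOf("@");
  if (at <= 0 || at === value.length - 1) {
    throw new Error("malformed email address");
  }
  return Object.freeze({ value, domain: value.slice(at + 1).toLowerCase() });
}

// Flow: the handler talks to the outside world (here, an injected lookup).
class RegisterUserHandler {
  constructor(bannedDomains) {
    this.bannedDomains = bannedDomains; // assumed to expose isBanned(domain)
  }

  handle(rawEmail) {
    const email = parseEmail(rawEmail);
    if (this.bannedDomains.isBanned(email.domain)) {
      throw new Error("email domain is not accepted: " + email.domain);
    }
    return { registered: email.value };
  }
}
```

In production the injected object might query a database or a service; in tests it can be a plain in-memory stub.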
Validation: Names, Titles, Strings... is length part of VO?
VOs, much like other DDD concepts, should "speak out loud" your business domain, meaning that they should express business semantics. This is why FirstName, Surname and StreetName are good names, while NotEmptyString is less preferable due to the fact it communicates technical rather than business details.
If, for example, your business states that customers with a more-than-50-characters-length name are to be addressed differently than customers with a less-than-50-characters-length name, then you probably should have two VOs, e.g. LongFirstName, ShortFirstName.
True, several VOs may require exactly the same validations, e.g. both StreetName and CityName must start with a capital and length cannot exceed 100. Does this mean we have to make great effort to avoid duplications in the name of "reusability"? I would say no, especially if avoiding duplications means having a single VO named CapitalHeadStringUpTo100Characters. Such name conveys no business purpose. Moreover, if CityName suddenly requires additional validations, breaking CapitalHeadStringUpTo100Characters into two VOs may require much work.
Copy of VO: Providing "Type-Safety" for entity arguments?
Inheritance is a tool provided by development platform, it is more than OK to use it, but only to the point where things get messy or too abstract. Remember, VO only expresses a specific domain-approach principle. The polymorphism OOP principle, on the other hand, which of course may be applied in DDD applications, is tightly coupled with abstraction concepts (i.e. base classes), and I would say it should fit better to the entities model part.
BTW, you can find on web several implementations for a base VO class.
Persistence: Reading stored VO
System design would matter little if the same validations ran over and over again at different points of a single use case. Unless you have a reason to believe that your database can be altered by external components, it is sufficient to reconstitute an entity from the database without re-validating. Also keep in mind that a typical entity embeds at least one VO, and it is the same VO in both the "persistence step" (when the entity is being constructed) and the "reading step" (when it is being reconstituted).
Persistence/Admin: Something like "god" in the system.
Multitenancy applications can be aware of multiple kinds of users. Software does not care if one user is more powerful than another one, it is only subjected to rules. Whether you choose to have a single general entity User with VO FutureDate allowed to be set with null, or two entities User, Admin with VOs FutureDate (not null), FutureDate (nullable) respectively, is less of our interest here. The important thing is that multitenancy can be achieved through smart usage of dependency injection: system identifies user privileges and infers what factories, services or validations are to be injected.
Persistence: Mapping to DB data-types
It really depends on level of maturity in the DDD field. Applications will always have bugs, and you should have some clue on your business's bug-tolerance level in case you choose to design a lenient database.
Aside from that, keep in mind that no matter how much effort you put into it, your database can probably never reflect the full set of business invariants: limiting a single VO to some length is easy, but setting rules involving multiple VOs (that is, when one VO's validity depends on another VO) is less convenient.

What is a "fully-hydrated User object" in the context of GraphQL? [duplicate]

When someone talks about hydrating an object, what does that mean?
I see a Java project called Hydrate on the web that transforms data between different representations (RDBMS to OOP to XML). Is this the general meaning of object hydration: to transform data between representations? Could it mean reconstructing an object hierarchy from a stored representation?
Hydration refers to the process of filling an object with data. An object which has not yet been hydrated has been instantiated and represents an entity that does have data, but the data has not yet been loaded into the object. This is something that is done for performance reasons.
Additionally, the term hydration is used when discussing plans for loading data from databases or other data sources. Here are some examples:
You could say that an object is partially hydrated when you have only loaded some of the fields into it, but not all of them. This can be done because those other fields are not necessary for your current operations. So there's no reason to waste bandwidth and CPU cycles loading, transferring, and setting this data when it's not going to be used.
Additionally, there are some ORM's, such as Doctrine, which do not hydrate objects when they are instantiated, but only when the data is accessed in that object. This is one method that helps to not load data which is not going to be used.
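The idea of hydrating only on first access can be sketched in a few lines of JavaScript. This illustrates the concept only and is not how Doctrine implements it; LazyUser and the loader callback are hypothetical.

```javascript
// Hypothetical lazily hydrated object: instantiated with an id only,
// filled with data the first time a data field is read.
class LazyUser {
  #id;
  #loader;
  #data = null;

  constructor(id, loader) {
    this.#id = id;
    this.#loader = loader; // function(id) returning the stored fields
  }

  get id() {
    return this.#id; // known without touching the data source
  }

  get isHydrated() {
    return this.#data !== null;
  }

  get name() {
    return this.#hydrate().name;
  }

  #hydrate() {
    if (this.#data === null) {
      this.#data = this.#loader(this.#id); // load once, reuse afterwards
    }
    return this.#data;
  }
}
```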
With respect to the more generic term hydrate
Hydrating an object is taking an object that exists in memory, that doesn't yet contain any domain data ("real" data), and then populating it with domain data (such as from a database, from the network, or from a file system).
From Erick Robertson's comments on this answer:
deserialization == instantiation + hydration
If you don't need to worry about blistering performance, and you aren't debugging performance optimizations that are in the internals of a data access API, then you probably don't need to deal with hydration explicitly. You would typically use deserialization instead so you can write less code. Some data access APIs don't give you this option, and in those cases you'd also have to explicitly call the hydration step yourself.
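The equation above can be spelled out as a small JavaScript sketch, with instantiation and hydration as separate steps. The User class and the helper names are made up for the example; Object.create() is used so that the constructor is skipped during instantiation.

```javascript
class User {
  constructor(name) {
    if (!name) throw new Error("name is required"); // only runs for brand-new users
    this.name = name;
  }

  greet() {
    return "Hello, " + this.name;
  }
}

// Instantiation: an object of the right type that holds no domain data yet.
function instantiate() {
  return Object.create(User.prototype);
}

// Hydration: fill the existing object with data.
function hydrate(target, data) {
  return Object.assign(target, data);
}

// Deserialization == instantiation + hydration (after parsing the stored form).
function deserialize(json) {
  return hydrate(instantiate(), JSON.parse(json));
}
```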
For a bit more detail on the concept of Hydration, see Erick Robertson's answer on this same question.
With respect to the Java project called hydrate
You asked about this framework specifically, so I looked into it.
As best as I can tell, I don't think this project used the word "hydrate" in a very generic sense. I see its use in the title as an approximate synonym for "serialization". As explained above, this usage isn't entirely accurate:
See: http://en.wikipedia.org/wiki/Serialization
translating data structures or object state into a format that can be stored [...] and reconstructed later in the same or another computer environment.
I can't find the reason behind their name directly on the Hydrate FAQ, but I got clues to their intention. I think they picked the name "Hydrate" because the purpose of the library is similar to the popular sound-alike Hibernate framework, but it was designed with the exact opposite workflow in mind.
Most ORMs, Hibernate included, take an in-memory object-model oriented approach, with the database taking second consideration. The Hydrate library instead takes a database-schema oriented approach, preserving your relational data structures and letting your program work on top of them more cleanly.
Metaphorically speaking, still with respect to this library's name: Hydrate is like "making something ready to use" (like re-hydrating Dried Foods). It is a metaphorical opposite of Hibernate, which is more like "putting something away for the winter" (like Animal Hibernation).
The decision to name the library Hydrate, as far as I can tell, was not concerned with the generic computer programming term "hydrate".
When using the generic computer programming term "hydrate", performance optimizations are usually the motivation (or debugging existing optimizations). Even if the library supports granular control over when and how objects are populated with data, the timing and performance don't seem to be the primary motivation for the name or the library's functionality. The library seems more concerned with enabling end-to-end mapping and schema-preservation.
While it is somewhat redundant vernacular as Merlyn mentioned, in my experience it refers only to filling/populating an object, not instantiating/creating it, so it is a useful word when you need to be precise.
This is a pretty old question, but it seems that there is still confusion over the meaning of the following terms. Hopefully, this will disambiguate.
Hydrate
When you see descriptions that say things like, "an object that is waiting for data, is waiting to be hydrated", that's confusing and misleading. Objects don't wait for things, and hydration is just the act of filling an object with data.
Using JavaScript as the example:
const obj = {}; // empty object
const data = { foo: true, bar: true, baz: true };
// Hydrate "obj" with "data"
Object.assign(obj, data);
console.log(obj.foo); // true
console.log(obj.bar); // true
console.log(obj.baz); // true
Anything that adds values to obj is "hydrating" it. I'm just using Object.assign() in this example.
Since the terms "serialize" and "deserialize" were also mentioned in other answers, here are examples to help disambiguate the meaning of those concepts from hydration:
Serialize
console.log(JSON.stringify({ foo: true, bar: true, baz: true }));
Deserialize
console.log(JSON.parse('{"foo":true,"bar":true,"baz":true}'));
In PHP, you can create a new instance of a class from its name without invoking the constructor, like this:
require "A.php";
$className = "A";
$class = new \ReflectionClass($className);
$instance = $class->newInstanceWithoutConstructor();
Then you can hydrate it by invoking setters (or assigning public properties).

Best practice with coding system values

I think this should be an easy one, but haven't found any clear answer, on what would the best practice be.
In an application, we keep current status of an order (open, canceled, shipped, closed ...).
These values cannot change without a code change, but the application should meet the following criteria:
1. status names should be easily displayed in different languages,
2. application can search via freetext status names (like googling for "open")
3. status_id should be available to developer via enum
4. zero headache when adding new statuses
Possible ways we have tackled this so far:
1. Having a DB table status with PK(id, language_id) and a separate enum which represents these statuses in the application.
PROS: 1., 2., 3. work out of the box. CONS: 4. needs an update script to be run on every client installation; SQL selects can become large and cumbersome when dealing with a lot of code tables.
2. Having just an enum:
PROS: 3., 4. CONS: 1., 2. are a total nightmare.
3. Having enums which populate database tables on each start of the application:
PROS: 1., 2., 3., 4. work. CONS: some overhead on application start; SQL selects can become large and cumbersome when dealing with a lot of code tables.
What is the most common way of tackling this problem?
Sounds like you summarized it pretty well yourself, and comparing the pros/cons points towards #3. Just one comment for when you implement #3, though:
Use a caching mechanism (even a simple HashMap!) plus adding the option to refresh the cache - will ease your work when you'll want to change values (without the need to restart every time!).
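A minimal JavaScript sketch of such a cache, using a Map in place of a HashMap. The StatusLabelCache name and the load callback (assumed to return [id, label] pairs, for example from the status table) are illustrative.

```javascript
// Hypothetical in-memory cache for code-table values, with an explicit refresh.
class StatusLabelCache {
  #load;
  #labels = new Map();

  constructor(load) {
    this.#load = load; // function returning an iterable of [statusId, label] pairs
    this.refresh();
  }

  // Re-read the source without restarting the application.
  refresh() {
    this.#labels = new Map(this.#load());
  }

  labelFor(statusId) {
    return this.#labels.get(statusId);
  }
}
```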
I would, and do, use method 3 because it is the best of the lot. You can use resource files to store the translations in and map the enum values to keys in the resource files. Your database can contain the id of the enum for the status.
1.status names should be easily displayed in different languages,
2.application can search via freetext status names (like googling for "open")
These are the interface layer's concern; you'd better not mix them into your domain model.
I would set up a mapping between the status enum and i18n codes. The mapping could be stored in a file (cached in memory) or hardcoded.
For example, if you use a DTO or view adapter to render your UI:
public class OrderDetailViewAdapter {
    private Order order;

    public String getStatus() {
        return i18nMapper.to(order.getStatus()); // use a hardcoded switch/case or a file-based implementation
    }
}
Or you could do this before populating your DTOs.
You could use a similar solution for goal2. When user types text, find corresponding enum from mapping and use enum for search.
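Both goals can be sketched with one hardcoded mapping. The JavaScript below is a minimal illustration: the enum values come from the question, while the label table, the German translations and the function names are assumptions made for the example.

```javascript
// Domain side: just the enum.
const OrderStatus = Object.freeze({
  OPEN: "OPEN",
  CANCELED: "CANCELED",
  SHIPPED: "SHIPPED",
  CLOSED: "CLOSED",
});

// Interface side: hardcoded mapping from enum value to display label per language.
const STATUS_LABELS = Object.freeze({
  en: { OPEN: "Open", CANCELED: "Canceled", SHIPPED: "Shipped", CLOSED: "Closed" },
  de: { OPEN: "Offen", CANCELED: "Storniert", SHIPPED: "Versandt", CLOSED: "Abgeschlossen" },
});

// Goal 1: display a status in the user's language.
function labelFor(status, lang) {
  const label = STATUS_LABELS[lang]?.[status];
  if (label === undefined) {
    throw new Error("no label for " + status + " in " + lang);
  }
  return label;
}

// Goal 2: turn free text into enum values, then search by enum.
function findStatuses(text, lang) {
  const needle = text.trim().toLowerCase();
  return Object.values(OrderStatus).filter((status) =>
    labelFor(status, lang).toLowerCase().includes(needle)
  );
}
```

The search itself then filters orders by the returned enum values, so the database never needs to know about translated names.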
Anyway, the fewer DB tables you use, the better.
Personally, I always use a dedicated enum class inside the domain. The only responsibility of this class is holding the status name (OPEN, CANCELED, SHIPPED, ...). The status name is not visible outside the codebase. The status can also be stored in a database field as a string (varchar or similar).
For rendering, depending on the number of use cases, I sometimes implement formatting inside a formatter (e.g. OrderFormatter::formatStatusName(), OrderFormatter::formatAbbreviatedStatusName(), ...). If formatting is needed often, I create a dedicated class with all the formatting styles needed (OrderStatusFormatter::short(), OrderStatusFormatter::abbreviated()...). Of course, an internal mapping is needed to map status name to status title, and this is the tricky part. But if you want layering, you can't avoid mapping.
Translation has not been dealt with so far. I translate strings inside templates, so formatters are free of that responsibility. To summarize:
enum inside domain model
formatter inside presentation layer
translation inside template
There is no need to create a special table for order status translations. A better choice would be to implement a generic translation mechanism, separated from your business code.

Philosophy Object/Properties Parameter Query

I'm looking at some code I've written and thinking "should I be passing that object into the method or just some of its properties?".
Let me explain:
This object has about 15 properties - user inputs. I then have about 10 methods that each use up to 5 of these inputs. Now, the interface looks a lot cleaner if each method has 1 parameter - the "user inputs object". But each method does not need all of these properties. I could just pass the properties that each method needs.
The fact I'm asking this question indicates I accept I may be doing things wrong.
Discuss......:)
EDIT: To add clarity:
From a web page a user enters details about their house and garden. Number of doors, number of rooms and other properties of this nature (15 in total).
These details are stored on a "HouseDetails" object as simple integer properties.
An instance of "HouseDetails" is passed into "HouseRequirementsCalculator". This class has 10 private methods like "calculate area of carpet", "calculateExtensionPotential" etc.
For an example of my query, let's use "CalculateAreaOfCarpet" method.
should I pass the "HouseDetails" object
or should I pass "HouseDetails.MainRoomArea, HouseDetails.KitchenArea, HouseDetails.BathroomArea" etc
Based on my answer above and related to your edit:
a) You should pass the "HouseDetails" object
Other thoughts:
Thinking more about your question, and especially the added detail, I'm left wondering why you would not just include those calculation methods as part of your HouseDetails object. After all, they are calculations that are specific to that object only. Why create an interface and another class to manage the calculations separately?
Older text:
Each method should and will know what part of the passed-in object it needs to reference to get its job done. You don't/shouldn't need to enforce this knowledge by creating fine-grained overloads in your interface. The passed-in object is your model and your contract.
Also, imagine how much code will be affected if you add and remove a property from this object. Keep it simple.
Passing individual properties - and different in each case - seems pretty messy. I'd rather pass whole objects.
Mind that you haven't given enough insight into your situation. Perhaps try to describe the actual usage of these things? What is this object with 15 properties? Are those "10 methods that use up to 5 of these inputs" on the same object, or on some other one?
After the question was edited
I would definitely go with passing the whole object and doing the necessary calculations in the Calculator class.
On the other hand, you may find Domain Driven Design an attractive alternative (http://en.wikipedia.org/wiki/Domain-driven_design). Following its principles, you could move the methods from the calculator to the HouseDetails class. Domain Driven Design is quite a nice style of writing apps; it just depends on how clean this way feels to you.
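Both options can be put side by side in a short JavaScript sketch. The property names come from the question; the carpet formula (a plain sum of the three areas) is an assumption, since the real calculation isn't shown.

```javascript
class HouseDetails {
  constructor({ mainRoomArea, kitchenArea, bathroomArea }) {
    this.mainRoomArea = mainRoomArea;
    this.kitchenArea = kitchenArea;
    this.bathroomArea = bathroomArea;
    Object.freeze(this);
  }

  // Option 2 (DDD style): the calculation lives on the object that owns the data.
  carpetArea() {
    return this.mainRoomArea + this.kitchenArea + this.bathroomArea;
  }
}

// Option 1: a calculator method that receives the whole object and picks what it needs.
// Adding or removing a HouseDetails property does not change this signature.
function calculateAreaOfCarpet(houseDetails) {
  return houseDetails.mainRoomArea + houseDetails.kitchenArea + houseDetails.bathroomArea;
}
```

Either way, callers never have to know which of the 15 inputs a given calculation depends on.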

Validation on domain entities along with MVP

How do you apply validation in an MVP/domain environment ?
Let me clarify with an example:
Domain entity:
class Customer
{
    string Name;
    // etc.
}
MVP-model
class CustomerModel
{
    string Name;
    // etc.
}
I want to apply validation on my domain entities, but the MVP model has its own model/class apart from the domain entity. Does that mean I have to copy the validation code to also work on the MVP-model?
One solution I came up with is to drop the MVP-model and use the domain entity as the MVP-model, but I don't want to set data on the entities that isn't validated yet.
A second problem that arises is that if the entity has notify-events, other parts of the application will be affected by faulty data.
A third thing with that approach is: if the user edits some data and then cancels the edit, how do I revert to the old values? (The entity might not come from a DB, so reloading the entity isn't possible in all cases.)
Another solution is to make some sort of copy/clone of the entity in question and use the copy as MVP-model, but then it might get troublesome if the entity has a large object graph.
Anyone has some tips about these problems?
Constraining something like the name of a person probably does not rightfully belong in the domain model, unless in the client's company there is actually a rule that they don't do business with customers whose names exceed 96 characters.
String length and the like are not concerns of the domain -- two different applications employing the same model could have different requirements, depending on the UI, persistence constraints, and use cases.
On the one hand, you want to be sure that your model of a person is complete and accurate, but consider the "real world" person you are modeling. There are no rules about length and no logical corollary to "oops, there was a problem trying to give this person a name." A person just has a name, so I'd argue that it is the responsibility of the presenter to validate what the user enters before populating the domain model, because the format of the data is a concern of the application more so than the domain.
Furthermore, as Udi Dahan explains in his article, Employing the Domain Model Pattern, we use the domain model pattern to encapsulate rules that are subject to change. That a person should not have a null name is not a requirement that is likely ever to change.
I might consider using Debug.Assert() in the domain entity just for an added layer of protection through integration and/or manual testing, if I was really concerned about a null name sneaking in, but something like length, again, doesn't belong there.
Don't use your domain entities directly -- keep that presentation layer; you're going to need it. You laid out three very real problems with using entities directly (I think Udi Dahan's article touches on this as well).
Your domain model should not acquiesce to the needs of the application, and soon enough your UI is going to need an event or collection filter that you're just going to have to stick into that entity. Let the presentation layer serve as the adapter instead and each layer will be able to maintain its integrity.
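A minimal JavaScript sketch of a presentation model acting as that adapter: it holds the user's edits, validates them, and only touches the entity on save, so cancel is trivial and no faulty data reaches the entity or its events. The class names and the 50-character UI limit are assumptions for illustration.

```javascript
// Domain entity: only accepts data through its own behavior.
class Customer {
  #name;

  constructor(name) {
    this.#name = name;
  }

  get name() {
    return this.#name;
  }

  rename(newName) {
    this.#name = newName;
  }
}

const MAX_NAME_LENGTH = 50; // assumed UI/persistence limit, not a domain rule

// Presentation model: an edit buffer in front of the entity.
class CustomerEditModel {
  constructor(customer) {
    this.customer = customer;
    this.name = customer.name; // working copy shown in the view
  }

  validate() {
    const errors = [];
    const name = this.name.trim();
    if (name === "") errors.push("Name is required.");
    if (name.length > MAX_NAME_LENGTH) errors.push("Name is too long.");
    return errors;
  }

  // Commits to the entity only when the input is valid.
  save() {
    const errors = this.validate();
    if (errors.length === 0) this.customer.rename(this.name.trim());
    return errors;
  }

  // Reverting needs no reload: the entity was never modified.
  cancel() {
    this.name = this.customer.name;
  }
}
```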
Let me be clear that the domain model does not have to be devoid of validation, but the validation that it contains should be domain-specific. For example, when attempting to give someone a pay raise, there may be a requirement that no raise can be awarded within 6 months of the last one so you'd need to validate the effective date of the raise. This is a business rule, is subject to change, and absolutely belongs in the domain model.
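The pay raise rule could look roughly like this inside the entity. It is a minimal JavaScript sketch: the Employee shape, the addMonths helper and the way dates are passed in are assumptions, and calendar edge cases (such as month-end dates) are ignored.

```javascript
const MIN_MONTHS_BETWEEN_RAISES = 6;

// Naive month arithmetic; dates near month end may roll over into the next month.
function addMonths(date, months) {
  const copy = new Date(date.getTime());
  copy.setUTCMonth(copy.getUTCMonth() + months);
  return copy;
}

class Employee {
  #salary;
  #lastRaiseOn;

  constructor(salary, lastRaiseOn = null) {
    this.#salary = salary;
    this.#lastRaiseOn = lastRaiseOn;
  }

  get salary() {
    return this.#salary;
  }

  // Business rule enforced by the domain model itself.
  giveRaise(amount, effectiveOn) {
    if (this.#lastRaiseOn !== null) {
      const earliest = addMonths(this.#lastRaiseOn, MIN_MONTHS_BETWEEN_RAISES);
      if (effectiveOn.getTime() < earliest.getTime()) {
        throw new Error("no raise can be awarded within 6 months of the last one");
      }
    }
    this.#salary += amount;
    this.#lastRaiseOn = effectiveOn;
  }
}
```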

Resources