Do I need to use state pattern for data approval process? - asp.net-mvc-3

Users of our system are able to submit unvalidated contact data. For example:
Forename: null
Surname: 231
TelephoneNumber: not sure
etc
This data is stored in a PendingContacts table.
I have another table - ApprovedContacts. This table has a variety of constraints to improve consistency and integrity. This table shouldn't contain any dirty or incomplete data.
I need a process to move data from one table to the other. The structure of both tables is nearly identical; however, one table has the constraints while the other doesn't.
I have two states: Pending and Approved. My gut feeling tells me that I should use the state pattern here. In theory this should allow me to change a contact's state from Pending to Approved, depending on whether the contact has been successfully validated. The problem is that I don't see how this is going to work.
Am I going in the right direction, or should I be looking at something completely different?
The presentation layer is in MVC 3, so I have view models for pending contacts and approved contacts, as well as domain models for pending contacts and approved contacts. My view models are generally DTOs with some validation routines, but now my view models represent state too. This doesn't seem right.
For example, all contacts must have a state and they can be saved and removed, so I need an interface for that:
public interface IContactViewModelState
{
    void Save(ContactViewModel item);
}
I then add an implementation for saving pending contacts into the PendingContacts table:
public class PendingContactViewModelState : IContactViewModelState
{
    public void Save(ContactViewModel item)
    {
        // Save to the pending contacts table.
        // I don't like this, as my view model now references the data access layer.
    }
}

Short answer: no, because you only have two states. You'd use a state pattern to help deal with complex situations with many states and rules. The only reason you might want to go with a full-blown state-pattern-based implementation is if there's a very high chance such a situation is imminent.
If the result of a successful transition to Approved is the record ending up in the approved table, then you really just need to decide where you want to enforce the constraints. This decision can be based on many factors, including the likely frequency of change (to the constraints) and where other logic resides.
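For illustration, here's a minimal sketch of that idea in C#: validation enforced in one plain service, no state pattern involved. Every type and member name below (PendingContact, IContactRepository, ContactApprovalService, and so on) is an assumption for the example, not something from the original post.

// All names below are illustrative assumptions, not from the post.
public class PendingContact
{
    public int Id { get; set; }
    public string Forename { get; set; }
    public string Surname { get; set; }
    public string TelephoneNumber { get; set; }
}

public class ApprovedContact
{
    public string Forename { get; set; }
    public string Surname { get; set; }
    public string TelephoneNumber { get; set; }
}

public interface IContactRepository
{
    void SaveApproved(ApprovedContact contact);  // INSERT into ApprovedContacts
    void RemovePending(int pendingContactId);    // DELETE from PendingContacts
}

public class ContactApprovalService
{
    private readonly IContactRepository _repository;

    public ContactApprovalService(IContactRepository repository)
    {
        _repository = repository;
    }

    // Returns false if the contact fails the rules that the
    // ApprovedContacts constraints express.
    public bool TryApprove(PendingContact contact)
    {
        if (string.IsNullOrWhiteSpace(contact.Forename)) return false;
        if (string.IsNullOrWhiteSpace(contact.Surname)) return false;

        _repository.SaveApproved(new ApprovedContact
        {
            Forename = contact.Forename,
            Surname = contact.Surname,
            TelephoneNumber = contact.TelephoneNumber
        });
        _repository.RemovePending(contact.Id);
        return true;
    }
}

The controller (or a domain service) calls TryApprove; on success the contact lands in ApprovedContacts and disappears from PendingContacts, with the table constraints acting as a final safety net.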
A lot of patterns (but not all) tend to deal with how to structure an application, but here I think it's just a case of deciding where and how to implement some logic. In other words - you might just be accidentally over-analyzing the problem - it's easily done :)

Related

DM and hierarchies - dimensions for future use

My very first DM, so be gentle...
Modeling a hierarchy (organization → department → event) with an ERD.
Responses are my facts. All the advice I've seen indicates creating a single dimension (say dim_event) and denormalizing event, department, and organization into that dimension.
What if I KNOW that there will be future facts/reports that rely on an Organization dimension or a Department dimension that doesn't involve this particular fact?
It makes more sense to me (from the OLTP world) to create individual dimensions for the major components and attach them to the fact. That way they could be reused as conformed dimensions.
This way, when updating dimension attributes there would be one dim table; if I had everything denormalized I could have the org name in several dimension tables.
--Update--
As requested:
An "event" is an email campaign designed to gather response data from a specific subset of clients. They log in and we ask them a series of questions and score the answers.
The "response" is the set of scores we generate from the event.
So an "event" record may look like this:
name: '2019 test Event'
department: 'finance'
"response" records look something like this:
event: '2019 test Event'
retScore: 2190
balScore: 19.98
If your organization and department are tightly coupled (i.e. department implies organization as well), they should be denormalized and created as a single dimension. If department & organization do not have a hierarchical relationship, they would be separate dimensions.
Your Event would likely be a dim (degenerate) and a fact. The fact would point to the various dimensions that describe the Event and would contain the measures about what happened at the Event (retScore, balScore).
A good way to identify whether you're dealing with a dim or a fact is to ask "What do I know before anything happens?" I expect you'd know which orgs & depts are available. You may even know certain types of recurring events (blood drive, annual fundraiser), which could also be a separate dimension (event type). But you wouldn't have any details about a specific event, HR Fundraiser 2019 (fact), until one is scheduled.
A dimension represents the possibilities, but a fact record indicates something actually happens. My favorite analogy for this is a restaurant menu vs a restaurant order. The items on the menu can be referenced even if they've never been ordered. The menu is the dimension, the order is the fact.
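To make the shapes concrete, here's a rough sketch of the resulting star as plain C# records; every table and column name below is an illustrative assumption, not something from the question.

// Department implies organization here, so the two are denormalized into
// one dimension; if they weren't hierarchical, OrgName would move out to
// its own DimOrganization table instead.
public record DimDepartment(int DeptKey, string DeptName, string OrgName);

// Known, recurring kinds of events (e.g. "blood drive") are a dimension.
public record DimEventType(int EventTypeKey, string TypeName);

// The fact: one row per response, pointing at the dimensions and
// carrying the measures from the question.
public record FactResponse(
    string EventName,   // degenerate dimension, e.g. "2019 test Event"
    int DeptKey,        // -> DimDepartment
    int EventTypeKey,   // -> DimEventType
    decimal RetScore,
    decimal BalScore);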
Hope this helps.

Does event store store state?

Theoretically, when using event sourcing you don't store "state" but events. But I've seen many implementations that store snapshots of state in a column, in a format like JSON or just a BLOB. For example:
Using an RDBMS as event sourcing storage
The events table has a Data column which stores the entire object. To me, that's like storing the state as of the time the event occurred.
The same goes for this picture (taken from Streamstone): it has a Data column with serialized state. So it stores state too, but inside an event?
So how do I replay from the initial state, then, if I can simply pick some event and access Data to get the state directly?
What exactly is stored inside Data: is it the state of the entire object, or a serialized event?
Let's say I have a person object (in C#)
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}
What should my events store when I create a Person or change properties like Name or Age?
When I create a Person, I will most likely send something like a PersonCreatedEvent with the initial state, i.e. the entire object.
But what if I change Name or Age? Should they be 2 separate events or just 1? A single PersonChangedEvent, or separate PersonChangedAgeEvent and PersonChangedNameEvent events?
What should be stored in the event in this case?
What exactly is stored inside Data: is it the state of the entire object, or a serialized event?
That will usually be a serialized representation of the event.
One way of thinking about it: a stream of events is analogous to a stream of patch documents. Current state is calculated by starting from some known default state and then applying each patch in turn -- aka a "fold". Previous states can be recovered by choosing a point somewhere in the stream, and applying the patches up to that point.
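To make the fold concrete, here's a hedged C# sketch; the event type names are invented for the example, and Person is the class from the question.

using System.Collections.Generic;

public abstract class PersonEvent { }
public class PersonCreated : PersonEvent { public string Name; public int Age; }
public class PersonNameChanged : PersonEvent { public string NewName; }
public class PersonAgeChanged : PersonEvent { public int NewAge; }

public static class PersonProjection
{
    // The "fold": start from a default state, apply each patch in turn.
    public static Person Replay(IEnumerable<PersonEvent> history)
    {
        var state = new Person();
        foreach (var e in history)
        {
            switch (e)
            {
                case PersonCreated c:     state.Name = c.Name;    state.Age = c.Age; break;
                case PersonNameChanged n: state.Name = n.NewName; break;
                case PersonAgeChanged a:  state.Age = a.NewAge;   break;
            }
        }
        return state;
    }
}

Replaying a prefix of the history instead of the full stream recovers any previous state, which is the "choose a point somewhere in the stream" idea above.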
The semantics of events, unlike patches, tend to be domain-specific. So Checked-In, Checked-Out rather than Patched, Patched.
We normally keep the events compact - you won't normally record state that hasn't changed.
Within your domain specific event language, the semantics of a field in an event may be "replace" -- for example, when recording a change of a Name, we probably just store the entire new name. In other cases, it may make sense to "aggregate" instead -- with something like an account balance, you might record credits and debits leaving the balance to be derived, or you might update the total balance (like a gauge).
In most mature domains (banking, accounting), the domain language has semantics for recording changes, rather than levels. We write new entries into the ledger, we write new entries into the checkbook register, we read completed and pending transactions in our account statement.
But what if I change Name or Age? Should they be 2 separate events or just 1? PersonChangedEvent, or PersonChangedAgeEvent and PersonChangedNameEvent?
It depends.
There's nothing wrong with having more than one event produced by a transaction.
There's nothing wrong with having a single event schema, that can be re-used in a number of ways.
There's nothing wrong with having more than one kind of event that changes the same field(s). NameLegallyChanged and SpellingErrorCorrected may be an interesting distinction to the business.
Many of the same concerns that motivate task based UIs apply to the design of your event schema.
It still seems to me like PersonChangedEvent will contain all person properties that can change, even if they didn't change.
In messaging (and event design takes a lot of lessons from message design), we often design our schema with optional fields. So the event schema can be super flexible, while any individual representation of an event would be compact (limited to only the information of interest).
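As a sketch of what such a flexible schema could look like (all names and fields here are assumptions): one reusable event type whose unset fields mean "unchanged", so that a serializer configured to omit nulls keeps each stored event compact.

using System;

public class PersonChanged
{
    public Guid AggregateId { get; set; }
    public int Version { get; set; }
    public string Name { get; set; }  // null => the name did not change
    public int? Age { get; set; }     // null => the age did not change
}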
To answer your question: an event that is stored should contain the event data only, not the object's state.
When you need to work on your entity, you read up all the events and apply them to get the latest state every time. So events should be stored with the event data only (of course together with AggregateId, Version, etc.).
The "object's state" will be the computation of all events, but if you have an event listener that listens to all your published events, it can populate a separate read model for you to query against, used as read-only from a user's perspective.
Hope it helps!
Updated answer to updated question:
It really depends on your model. If you update Age and Name at the same time, then yes, the new age and name values should be stored in a new event.
The event should only contain this data: name and age, together with aggregateId, version, etc.
The event listener will listen specifically for each event (created, updated, etc.), find the aggregate's read model that you have stored, and update only these 2 properties (in this example).
For the create event, you create the object for the read model.
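A rough sketch of that listener, reusing the Person class and the invented event types from the fold example above (all names remain assumptions):

using System;
using System.Collections.Generic;

public class PersonReadModelListener
{
    // The read model: a simple id -> current-state lookup for queries.
    private readonly Dictionary<Guid, Person> _readModel = new Dictionary<Guid, Person>();

    public void Handle(Guid aggregateId, PersonEvent e)
    {
        switch (e)
        {
            case PersonCreated c:
                _readModel[aggregateId] = new Person { Name = c.Name, Age = c.Age };
                break;
            case PersonNameChanged n:
                _readModel[aggregateId].Name = n.NewName; // patch only what changed
                break;
            case PersonAgeChanged a:
                _readModel[aggregateId].Age = a.NewAge;
                break;
        }
    }
}

Unlike the replay fold, this runs incrementally as events are published, so reads never have to touch the event stream.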

Event Sourcing - Aggregate modeling

Two questions:
1) How to model aggregates and the references between them
2) How to organise/store events so that they can be retrieved efficiently
Take this typical use case as an example: we have Order and LineItem (together they form an aggregate, with Order as the aggregate root), and a Product aggregate.
As a LineItem needs to know which Product it refers to, there are two options: 1) LineItem holds a direct reference to the Product aggregate (which seems not to be best practice, as it violates the idea of an aggregate being a consistency boundary, because we could update the Product aggregate directly from the Order aggregate); 2) LineItem only holds a ProductId.
It looks like the 2nd option is the way to go... What do you think?
However, another problem arises around building an Order read/view model. This Order view model needs to know which Products are in the Order (i.e. ProductId, Type, etc.). The typical use case is reporting, and a CommandHandler can also use this Product object to perform logic such as checking whether there are too many of a particular product, etc. Given that those data live in two separate aggregates, doing this requires more than one database roundtrip. As we are using events to build the model, the pseudo-code looks like this:
1) for a given order id (guid, order aggregate id), we load all the events for it; -- 1st database access
2) then we build an Order aggregate, and then we know which ProductIds are referenced in the Order;
3) for the list of ProductIds, we load all events for them; -- 2nd database access
If we build a really big graph of objects (a lot of different aggregates), then this may end up needing a few more database accesses (each of which is slow)... What's your idea here?
Thanks
Take this typical use case as an example: we have Order and LineItem (together they form an aggregate, with Order as the aggregate root), and a Product aggregate.
The Order aggregate makes sense the way you have described it. "Product aggregate" is more suspicious; do you ask the model if the product is allowed to change, or are you telling the model that the product has changed?
If Product can change without first consulting with the order, then the LineItem must not include the product. A reference to the product (aka the ProductId) is ok.
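A minimal sketch of option 2, with all names assumed for illustration:

using System;
using System.Collections.Generic;

public class LineItem
{
    public Guid ProductId { get; private set; } // a reference, not the Product aggregate itself
    public int Quantity { get; private set; }

    public LineItem(Guid productId, int quantity)
    {
        ProductId = productId;
        Quantity = quantity;
    }
}

public class Order // aggregate root and consistency boundary
{
    private readonly List<LineItem> _lineItems = new List<LineItem>();

    public IReadOnlyList<LineItem> LineItems => _lineItems;

    public void AddLineItem(Guid productId, int quantity)
    {
        // Only the id crosses the boundary; nothing here can mutate Product.
        _lineItems.Add(new LineItem(productId, quantity));
    }
}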
If we build a really big graph of objects (a lot of different aggregates), then this may end up needing a few more database accesses (each of which is slow)... What's your idea here?
For reads, reports, and the like -- where you aren't going to be adding new events to the history -- one possible answer is to do the slow work in advance. An asynchronous process listens for writes in the event store, and then publishes those events to a bus. Subscribers build new versions of the reports when new events are observed, and cache the results. (search keyword: cqrs)
When a client asks for a report, you give them one out of the cache. All the work is done, so it's very quick.
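In outline, such a subscriber might look like the following sketch; every name here is hypothetical:

using System;
using System.Collections.Generic;

// Subscribes to the event bus, folds events into precomputed report rows,
// and serves reads straight from the cache with no replay at request time.
public class OrderReportProjection
{
    private readonly Dictionary<Guid, OrderReport> _cache = new Dictionary<Guid, OrderReport>();

    public void OnOrderPlaced(Guid orderId)
    {
        _cache[orderId] = new OrderReport { OrderId = orderId };
    }

    public void OnLineItemAdded(Guid orderId, Guid productId)
    {
        _cache[orderId].ProductIds.Add(productId);
    }

    public OrderReport GetReport(Guid orderId)
    {
        return _cache[orderId]; // all the work was done in advance
    }
}

public class OrderReport
{
    public Guid OrderId { get; set; }
    public List<Guid> ProductIds { get; } = new List<Guid>();
}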
For command handlers, the answer is more complicated. Business rules are supposed to be in the domain model, so having the command handler try to validate the command (as opposed to the domain model) is a bit broken.
The command handler can load the products to see what the state might look like, and pass that information to the aggregate along with the command data, but it's not clear that's a good idea -- if the client is going to send a command to be run, and you need to flesh out the Order command with Product data, why not instead have the client add the product data to the command directly and skip the middleman?
CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc.
This example is a bit vague, but taking a guess: you are thinking about cases where you prevent an order from being placed if the available inventory is insufficient to fulfill the order.
For real world inventory - a physical book in a warehouse - that's probably the wrong approach to take. First, the model itself is wrong; if you want to know how much product is in the warehouse, you should be querying the warehouse, not the product. Second, a physical warehouse is not constrained by your model -- calling the addProduct method on a warehouse aggregate doesn't cause the product to magically appear there.
Third, it probably doesn't match very well with what your domain experts want anyway. If the model says that the warehouse doesn't have enough product, do you think the stakeholders want the system to
tell the shopper to buy the product somewhere else, or...
accept the order, and contact the supplier for a new delivery?
Hint: when in doubt, carefully review how amazon.com does it.

Database schema for rewarding users for their activities

I would like to provide users with points when they do a certain thing. For example:
adding article
adding question
answering question
liking article
etc.
Some of them can have conditions, like only awarding points for the first 3 articles a day, but I think I will handle this directly in my code base.
The problem is: what would be a good database design to handle this? I'm thinking of 3 tables.
user_activities - in this table I will store event types (I use Laravel, so it would probably be the event class name) and the points for a specific event.
activity_user - pivot table between user_activities and users.
and of course the users table
It is very simple, so I am worried that there are some conditions I haven't thought of that will come back and bite me in the future.
I think you'll need a fourth table, activities, that is simply a list of the kinds of activities to track. It will have an ID column, and your user_activities table will include an activity_id to link to it. You'll no doubt have unique information for each kind; for example, the activities table may have columns like
ID : unique ID per laravel
ACTIVITY_CODE : short code to use as part of application/business logic
ACTIVITY_NAME : longer name that is for display name like "answered a question"
EVENT : what does the user have to do to trigger the activity award
POINT_VALUE: how many points for this event
etc
If you think the points may change in the future (e.g. to encourage certain user activities), then you'll want to track the actual points awarded at the time in the user activities table, or have some way to track what the points were at any one time.
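For example, user_activities could carry its own snapshot columns (these names are just a suggestion, not from the question):
USER_ID : who earned the award
ACTIVITY_ID : link to the activities table
POINTS_AWARDED : the points as they were when the event fired
AWARDED_AT : timestamp, which also lets you enforce rules like "first 3 articles a day"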
While I'm suggesting a fourth table, what you really need is a more carefully worded list of features to be implemented before doing any design work. My example of allowing points awarded to change over time is just such a feature: one you don't mention, but that you'll need to design for if it's needed.
Well, I have found this https://laracasts.com/lessons/build-an-activity-feed-in-laravel to be a very good solution. Hope it helps someone :)

How does one validate a partial record when using EF/Data Annotations?

I am updating a record over multiple forms/pages, like a wizard. I need to save to the database after each form; however, this produces only a partial record. The EF POCO model has data annotations on all the properties (fields), so I suspect that when I save this partial record I will get an error.
So I am unsure of the simplest solution to this.
Some options I have thought of:
a) Create a view model for each form, with the data annotations on the view model instead of the EF domain model.
b) Save specific properties, rather than SaveAll, in the controller for the view, thereby not triggering validation for the non-relevant properties.
c) Some other solution...??
Many thanks in advance,
Option a). Validation probably (especially in your case) belongs on the view model anyway. If it is technically valid (DB-constraint-wise) to have a partially populated record, then that is further evidence that the validation belongs on the view.
Additionally, by abstracting the validation to your view, you're allowing other consuming applications to have their own validation logic.
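For instance, here's a minimal sketch of option a), with illustrative names: each wizard step gets its own view model carrying only that step's rules, while the EF entity stays free of the annotations that would reject a partial record.

using System.ComponentModel.DataAnnotations;

// Step 1 collects and validates only the name.
public class WizardStepOneViewModel
{
    [Required]
    [StringLength(100)]
    public string Name { get; set; }
}

// Step 2 collects and validates only the age.
public class WizardStepTwoViewModel
{
    [Required]
    [Range(0, 130)]
    public int? Age { get; set; }
}

In the controller, ModelState.IsValid then reflects only the current step's rules before you map the view model onto the entity and save.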
Additional thoughts:
I will say, though, just as a side note, that it's a little awkward saving your data partially like you're doing, and unless you have a really good reason (which I originally assumed you did), you might consider holding onto that data elsewhere (e.g. in session) and persisting it all together at the end of the wizard.
This will allow better, more appropriate DB constraints for better data integrity. For example, if a whole record shouldn't allow a null name, then allowing nulls for the sake of breaking your commits up for the wizard may cause more long-term headaches.
