I am new to Event Sourcing and have come across an example where I am not quite sure about the pros and cons of the different approaches.
Let's say this is a bank example, I have three entities Account, Deposit and Transfer.
My idea is that when a user deposits, the command bank.deposit will create two events:
deposit.created and account.deposited. Can I or should I include the deposit.created event uuid in account.deposited as a reference?
Taking it to the next step: if the bank later adds a transfer feature, should I make a separate event account.transfer_received, or should I create a more general event account.credited to be used by both deposit and transfer?
Thanks in advance.
A good article to review is Nobody Needs Reliable Messaging. One key observation is that you often need identifiers at the domain level.
For instance, when I look at my bank account, and see that the account history includes a specific deposit, there is an identifier for the deposit that is reported in the view.
If you imagine it from an event sourced perspective, before the deposit the balance was X, and the history did not include deposit 12345; after processing the deposit, the balance was X+Y and deposit 12345 was in the account history.
(This means, among other things, that if a second copy of deposit 12345 were to appear, the domain model would know to ignore it even if the identifier for the event were different).
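As a rough sketch (the names here are assumptions, not a prescription), the deduplication key is the domain-level deposit identifier, not the event's own uuid:

# Minimal sketch: deduplicating deposits by their domain-level identifier.
# Names (Account, account.deposited, deposit_id) are illustrative only.

class Account:
    def __init__(self):
        self.balance = 0
        self.applied_deposit_ids = set()  # domain-level ids we have already seen

    def deposit(self, deposit_id: str, amount: int) -> list[dict]:
        """Handle the 'deposit' command; emit events, or nothing for a duplicate."""
        if deposit_id in self.applied_deposit_ids:
            return []  # a second copy of deposit 12345: ignore it
        return [{"type": "account.deposited",
                 "deposit_id": deposit_id,  # the domain identifier, not the message uuid
                 "amount": amount}]

    def apply(self, event: dict) -> None:
        """Fold an event into current state (used when deciding and when replaying)."""
        if event["type"] == "account.deposited":
            self.applied_deposit_ids.add(event["deposit_id"])
            self.balance += event["amount"]

The uuid of deposit.created can still be carried along (for example in event metadata), but the decision to ignore a duplicate hinges on the domain identifier.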
Now, there are reasons that you might want to keep various message ids around. See Hohpe's work on Enterprise Integration Patterns; in particular Correlation Identifier.
should I make a separate event
Usually. "Make the implicit, explicit". The fact that two events happen to have similar representations is not a reason to blur them when the ubiquitous language distinguishes the two.
It's somewhat analogous, in motivation, to providing a task-based UI or eschewing the use of generic repositories.
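For illustration only, a sketch of what keeping the two events distinct might look like, using names adapted from your question:

# Two explicit event types, even though their shapes are almost identical.
# The ubiquitous language distinguishes a deposit from an incoming transfer,
# so the events do too; a projection can still treat both as credits.
from dataclasses import dataclass

@dataclass(frozen=True)
class AccountDeposited:
    account_id: str
    deposit_id: str
    amount: int

@dataclass(frozen=True)
class AccountTransferReceived:
    account_id: str
    transfer_id: str
    amount: int

def apply_to_balance(balance: int, event) -> int:
    # A read model that only cares about "credits" can still generalise here.
    if isinstance(event, (AccountDeposited, AccountTransferReceived)):
        return balance + event.amount
    return balance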
This is a general architectural question about ES. The concern is the need to keep a large number of events that are unimportant to the business: they affect intermediate state, but at the end of the day we definitely won't care about them (we will just ignore them).
Say we have a User that has a list of items (e.g. Tasks), and the user may quite often add/remove/edit different fields of a task. If we are building on ES, we should treat each update as an individual event, for example TaskNameChange, TaskCommentChange, etc., or we may have one TaskModified event. In our case task state changes are actually not important to us: we don't get much from the task change history, and from the business standpoint we only ever care about the last ones (for example the last TaskNameChange), but we would still have to track and record all the events.
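To make this concrete, here is a rough sketch (with made-up event names) of what I mean; only the last values ever matter to us:

# Hypothetical task events; the business only ever cares about the latest values.
events = [
    {"type": "TaskNameChange",    "task_id": "T1", "name": "Draft report"},
    {"type": "TaskCommentChange", "task_id": "T1", "comment": "WIP"},
    {"type": "TaskNameChange",    "task_id": "T1", "name": "Draft Q3 report"},
]

# A read model that keeps only the last value per field, yet with event
# sourcing every intermediate change above still has to be stored.
task = {}
for e in events:
    if e["type"] == "TaskNameChange":
        task["name"] = e["name"]
    elif e["type"] == "TaskCommentChange":
        task["comment"] = e["comment"]

print(task)  # {'name': 'Draft Q3 report', 'comment': 'WIP'}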
Again, my concern is that we would have to record and keep a large number of business-meaningless events in the event store.
Has anyone met such a situation, and what are your ideas about it?
Has anyone met such a situation, and what are your ideas about it?
Horses for courses
If the costs of keeping a complete event-backed history of your document exceed the business value you can accrue from that history, then don't design your system to keep all of the history. Set up a document store, overwrite the previous version of the document on each save, and get on with it.
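A minimal sketch of that simpler approach (SQLite is used here purely for illustration):

# Minimal sketch of "just overwrite on save": no history, only the latest document.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, body TEXT)")

def save_task(task_id: str, task: dict) -> None:
    # Each save replaces whatever was stored before; no event history is kept.
    db.execute("INSERT OR REPLACE INTO tasks (id, body) VALUES (?, ?)",
               (task_id, json.dumps(task)))
    db.commit()

save_task("T1", {"name": "Draft report"})
save_task("T1", {"name": "Draft Q3 report"})  # silently overwrites the previous version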
Greg Young: a whole system based on event sourcing is an anti-pattern.
I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event, and I'm having difficulty choosing between different implementations. Let me illustrate my quandary.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schemas of the Created and Modified events are the same and any subscriber receives the complete current state of the entity. However, it is difficult for any receiver to determine which properties have changed without tracking state themselves.
I then thought about an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
  <CustomerId>1234</CustomerId>
  <Before>
    <FullName>Bob</FullName>
    <AccountLevel>Silver</AccountLevel>
  </Before>
  <After>
    <FullName>Bob</FullName>
    <AccountLevel>Gold</AccountLevel>
  </After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
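For completeness, the consumer-side diffing I imagine would be something like this (a sketch, assuming the Before/After sections have already been parsed into dictionaries):

# Sketch: given the Before/After sections (already parsed into dicts),
# a consumer can work out which properties actually changed.
before = {"FullName": "Bob", "AccountLevel": "Silver"}
after = {"FullName": "Bob", "AccountLevel": "Gold"}

changed = {key: (before.get(key), after[key])
           for key in after
           if before.get(key) != after[key]}

print(changed)  # {'AccountLevel': ('Silver', 'Gold')}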
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with @Matt's answer: "CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service when it seems that you (presumably) already have a source of truth for Customer data. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for needing to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
<CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
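A rough sketch of that consumer-side flow; the service URL and event shape below are assumptions for illustration, not a prescribed API:

# Sketch of "signal + query": the event only identifies the customer,
# and the consumer fetches whatever it needs from the source of truth.
import json
import urllib.request
import xml.etree.ElementTree as ET

CUSTOMER_SERVICE = "https://customers.example.com/api/customers"  # hypothetical

def on_customer_state_change(event_xml: str) -> dict:
    customer_id = ET.fromstring(event_xml).findtext("CustomerId")
    with urllib.request.urlopen(f"{CUSTOMER_SERVICE}/{customer_id}") as resp:
        return json.load(resp)  # the consumer decides which fields it cares about

event = ("<SomeInterestingCustomerStateChange>"
         "<CustomerId>1234</CustomerId>"
         "</SomeInterestingCustomerStateChange>")
# customer = on_customer_state_change(event)  # would query the source of truth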
There is no one-size-fits-all solution; sometimes it does make more sense to pass the relevant data with the event. Again, I agree with @Matt's answer if this is the direction you need to move in.
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
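To illustrate the problem (a sketch, not tied to any particular serializer), one way to keep the two cases apart is an explicit "missing" marker:

# Sketch: distinguishing "set to null" from "not included in the event".
# With a plain static type both would deserialize to None; a sentinel keeps them apart.
MISSING = object()  # marker meaning "this field was not present in the event"

def parse_modified_event(fields: dict) -> dict:
    # fields: the raw key/value pairs that were actually present in the payload
    return {
        "FullName": fields.get("FullName", MISSING),
        "AccountLevel": fields.get("AccountLevel", MISSING),
    }

update = parse_modified_event({"AccountLevel": "Gold"})
assert update["AccountLevel"] == "Gold"
assert update["FullName"] is MISSING  # not "FullName was cleared", just absent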
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details. For example, you could have a CustomerRegistered event with all the details for the customer that was registered, and then later in the stream a CustomerMadeGoldAccount event that only really needs to capture the customer Id of the customer whose account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event. Imagine having hundreds of properties for a customer: if every command that changed a single property had to raise an event with all the properties before and after, this would get unwieldy pretty quickly. It's also difficult to determine why the change occurred if you just publish a generic CustomerModified event, which is often a question that is asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs enough data about the entity to validate that the command can be executed; it doesn't even need to read the whole customer entity.
Subscribers of the events also only need to build up state for the things they are interested in. For example, perhaps an 'account level' widget is listening to these events; all it needs to keep around is the customer ids and account levels so that it can display what account level each customer is at.
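A small sketch of such a subscriber, with event and field names assumed from the examples above:

# Sketch: an 'account level' widget only keeps the projection it needs.
account_levels: dict[str, str] = {}  # customer id -> current account level

def handle(event: dict) -> None:
    if event["type"] == "CustomerRegistered":
        account_levels[event["customer_id"]] = event["account_level"]
    elif event["type"] == "CustomerMadeGoldAccount":
        account_levels[event["customer_id"]] = "Gold"
    # every other event type is simply ignored by this widget

handle({"type": "CustomerRegistered", "customer_id": "1234", "account_level": "Silver"})
handle({"type": "CustomerMadeGoldAccount", "customer_id": "1234"})
print(account_levels["1234"])  # Gold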
Instead of trying to convey everything through the payload XML's fields, you can distinguish between different operations based on:
1. Different endpoint URLs depending on the operation (this is preferred).
2. An opcode (operation code) element in the XML that tells which operation should be used to handle the incoming request (closer to your examples; see the sketch below).
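A rough sketch of option 2, with a hypothetical opcode element and handler map:

# Sketch of option 2: route on an <OpCode> element inside the payload.
# The element name and opcode values are assumptions for illustration.
import xml.etree.ElementTree as ET

def handle_customer_upgraded(root): ...
def handle_customer_renamed(root): ...

HANDLERS = {
    "CustomerUpgraded": handle_customer_upgraded,
    "CustomerRenamed": handle_customer_renamed,
}

def dispatch(xml_payload: str) -> None:
    root = ET.fromstring(xml_payload)
    opcode = root.findtext("OpCode")
    HANDLERS[opcode](root)  # unknown opcodes would raise KeyError; handle as needed

dispatch("<CustomerEvent><OpCode>CustomerUpgraded</OpCode>"
         "<CustomerId>1234</CustomerId></CustomerEvent>")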
There are a few enterprise patterns applicable to your business case, such as messaging and its variants; if your system needs to be extensible, an Enterprise Service Bus (ESB) could be used. An ESB allows reliable handling and processing of events.
I am trying to understand how to use the FHIR Questionnaire resource, and have a specific question regarding this.
My project is specifically regarding how a citizen in our country could be responding to Questionnaires via a web app, which are then submitted to the FHIR server as QuestionnaireAnswers, to be read/analyzed by a health professional.
A FHIR-based system could have lots of Questionnaires (Qs), and groups of Qs or even specific Qs would be targeted towards certain users or groups of users. The display of a questionnaire to the citizen could also be based on a care plan of some sort, for example certain Questionnaires that need to be filled in in the weeks after surgery. The Questionnaires could also be regular ones that need to be filled in every day or week on an ongoing basis, to support data collection on the state of a chronic disease.
What I'm wondering is if FHIR has a resource which fits into organizing the 'logistics' of displaying the right form to the right person. I can see CarePlan, which seems to partly fit. Or is this something that would typically be handled out-of-FHIR-scope by specific server implementations?
So, to summarize:
Which resource or mechanism would a health professional use to specify that a patient should answer certain Questionnaires, either regularly or as part of, for example, a follow-up after surgery? This would include setting up the schedule for the form(s) to be filled in, and possibly configuring what would happen if the form wasn't filled in as required.
Which resource (possibly the same) or mechanism would be used for the patient's web app to retrieve the relevant Questionnaire(s) at a given point in time?
At the moment, the best resource for saying "please capture data of type X on schedule Y" would be DiagnosticOrder, though the description probably doesn't make that clear. (If you'd be willing to click the "Propose a change" link and submit a change request for us to clarify, that'd be great.) If you wanted to order multiple questionnaires, then CarePlan would be a way to group that.
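As a very rough sketch of what the client side could look like (the base URL is hypothetical, and the exact search parameters should be checked against the DSTU2 definitions):

# Rough sketch: fetch the DiagnosticOrders for a patient from a FHIR server.
# The base URL and search parameters are illustrative; verify them against
# the version of the spec you target.
import json
import urllib.request

FHIR_BASE = "https://fhir.example.org/baseDstu2"  # hypothetical server

def orders_for(patient_id: str) -> list[dict]:
    url = f"{FHIR_BASE}/DiagnosticOrder?subject=Patient/{patient_id}"
    with urllib.request.urlopen(url) as resp:
        bundle = json.load(resp)
    # A search returns a Bundle; each entry.resource is a DiagnosticOrder.
    return [entry["resource"] for entry in bundle.get("entry", [])]

# orders = orders_for("123")
# The app would then decide, from each order's schedule/details, which
# Questionnaire to fetch and display right now.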
The process of taking a complex schedule (or set of schedules) and turning that into a simple list of "do this now" requests that might be more suitable for a mobile application to deal with is scheduled for DSTU 2.1. Until then, you have a few options for the mobile app:
- have it look at the CarePlan and complex DiagnosticOrder schedule and figure things out itself
- have a server generate a List of mini 1-time DiagnosticOrders and/or Orders identifying the specific "answer" times
- roll your own mechanism using the Other/Basic resource
Depending on your timelines, you might want to stay tuned to discussions by the Patient Care and Orders and Observations work groups as they start dealing with the issues around workflow management starting next month in Atlanta.
The team I manage has been using TFS for years, but we've used a 3rd-party system for tracking bugs and features. I'm looking to upgrade our team to TFS 2013, and I've done tons of reading and research into how TFS manages work items, backlogs, iterations, tasks, etc. Although I understand the principles of what 'can' be done, I'm having a hard time visualizing 'how' our team would work with these work items as tasks.
If anyone knows of any best-practice guides with actual sample-based usage, or can answer any of these questions, that'd be great.
1) Product backlog - Under the 'configure schedule and iterations' what is the concept for setting the current 'backlog iteration'? Our team uses short 2 week iterations with a build number, but setting the build iteration as the current backlog makes all new PBI's scoped to only that iteration. Any items not complete in that iteration would disappear once I set the current build to the next iteration number. On the other hand, if I set it to the parent root node, I could see the PBI list getting rather large over time. What is the best method for managing PBI's that are unassigned and working in a simple Parent->build1/build2 etc structure?
2) Features - So I create a feature, perhaps it spans many work items and several tasks. They get completed over time, but I've noticed there's no 'auto' complete or status updates on parent items. So who/when is a Feature item supposed to get marked complete? If the product owner is supposed to use the features list to get an overview of work, they have no idea if all the dependent items have been complete and when to mark the feature Done.
3) Work Items - Managing these, and in particular their 'state' or status seems like a royal pain. On the task board you can't change their state, only their tasks with drag-drop, which is nice. But you complete all the tasks, and the parent work item stays in status 'New'. Do you really have to micro-manage every work item, open it up, and set the state to Done?
4) QA/testing - For every work item, each team member is responsible for testing each item, so every item is tested by multiple people, and logging any issues found. What's the best way to use work items or tasks for this?
5) Build Complete - Once every work item in the iteration is marked Done then I assume they are removed from the product backlog correct? The exception to this seems to be the features they were tied to, the feature item itself remains open. How do stakeholders view a list of features that were completed in the current build?
I can't answer everything (indeed, there is no one "right" answer), but here's how my team uses TFS - it might give you some ideas:
We use Area Path to represent a Project or Epic that work belongs under. When a work item is created it is assigned to a project using the Area Path, and it never changes.
Then to represent "when" work is done we use a hierarchical iteration path under 3 headings (for a project called "Project"): Project\Completed, Project\Current, Project\Future.
Stories in the product backlog are initially assigned to Future (We go a bit further in fact and use New stories to represent "proposed" work, and Active ones to represent the "approved" backlog - this allows us to plan tentative projects/contracts that convert into real work when they get the green light). At this stage we do Planning Poker to get Story Points and then the Project Managers assign stack ranks to the stories to help to decide what to move from Proposed to Product backlog, and then eventually what we should think about for the next iteration.
When we start an iteration we create a new iteration (call it 001) under Future, i.e. Project\Future\001. Then Stories are chosen from the Product Backlog for implementation - they get assigned to this iteration. When the iteration is ready to start, we use a "conveyor belt" approach which moves all the iterations along one "place" in the hierarchy: In the Iteration Path configuration UI, just drag the 001 iteration from Future to Current. This re-paths everything in that path automatically so that all the active work is instantly under Project\Current.
As we complete the iteration, we would have Current\001 and we'd then add Future\002. Then we move 001 and 002 along the conveyor (to Project\Completed\001 and Project\Current\002 respectively). This way the work gets assigned to one iteration and stays there, but the iteration as a whole moves from future ...to current ...to completed. This allows us to build queries like "all current work" (all work under "Project\Current") that we don't need to rewrite for every iteration, and this saves a massive amount of time and eliminates a lot of mistakes trying to re-assign iteration paths constantly - in most cases the iteration is only changed once (from future to an actual iteration).
When a story moves into the current iteration, we choose an implementing team (e.g. an owner to accept delivery, and a developer and a tester to implement the work) and those people add tasks for any work that needs to be done to deliver the story. Any bugs/issues that crop up for that story during the iteration are also parented to the Story or Tasks.
We found the TFS tools pretty poor (clumsy, slow, micro-managing), so we now use a home-built dashboard that shows us a list of stories, so in our scrum we can step through the stories and see the tasks/bugs/issues for each, who is working on them, and how much work they reported on the task since the last scrum. This gives us a really clear basis to discuss the story.
We close tasks/bugs/issues as we complete them, but the story stays open till the end of the iteration (so that any new bugs found can be attached and dealt with). We then use a custom tool to "Resolve" the story, which closes all the child work items, and then checks if the parent Feature or Epic is now completed and can also be marked "Resolved". This can also be done in stock TFS using a manual process, but it is rather laborious, and the code to automate it is only an hour or two's work. I really don't understand why TFS makes you essentially update all the database tables by hand when it's so easy to automate. (In a similar way, the TFS kanban is unnecessarily time-consuming to manage because items only appear on it if they are perfectly formed: get any of the estimate, remaining, completed, area, iteration, assigned-to, parent link, etc. wrong and it vanishes! So I've written, for example, a simple 'create task' tool that asks for the estimate, assignee and title, and fills in the rest; this took me a couple of hours to implement and has eradicated all the time-consuming errors and hassle of using TFS 'raw'.)
When processing tasks, TFS provides 'Activity' states (planning, development, testing, documentation, etc.), which implies that each single task will be passed linearly through a chain of different people to be implemented... but we feel this is a poor approach, because we want to encourage the team working on a story to work in parallel and work together, not "throw their bit over the fence to the next guy".
So instead each person on the team creates one or more tasks under the story that represent the parcels of work (programming, testing, documenting) they must personally do to deliver the story, and each task only ever has one owner. (This works well in our scrum dashboard because it shows the story and its list of child tasks/bugs/issues, so the entire context of the story's work can be seen easily at a glance.) The separate tasks allow the programmer and tester to work together in a tight, iterative, co-operative agile loop, often with progressive roll-out of parts of the feature for testing, rather than the programmer finishing all his work before passing the complete article over to the tester in a waterfall-y way.
At the end of the iteration, the story-team demos their story to the wider development team, and they are all equally responsible for ensuring that everything needed is delivered. After the demo, the Product Owner/Champion then accepts the work as done (or rejects it). This vastly reduces the amount of work that gets dropped "between the cracks" where people think somebody else will do it, helping us to get to a solid delivery at the end. We've found communication within the team and story delivery significantly improved since we moved to this approach.
I should mention that to get good estimates and burn-downs we try to keep each task less than 5 days work, and to avoid micro-management we try to avoid splitting down tasks into anything under about 2 days (though obviously some tasks are necessarily shorter).
As I've mentioned, we log bugs/issues as children of the task or story they affect (and can also add Related links if they impact more than one story). At the end of the iteration, as well as demoing the new features to the rest of the team, the release build is regression-tested as a whole. Any bugs found are fixed in a release branch, and within (hopefully) a day or two we have a stable customer release. We aim to have a product of customer-releasable quality from every iteration, and to keep the number of outstanding bugs per developer below 5 (usually 1-3). Before introducing this system, we had an ongoing average of 20 bugs per developer, an unpleasant technical debt. (Note: we reserve some time in every iteration for fixing these bugs, but when bugs are too gnarly to fix then-and-there, we usually convert them into new stories so that they can be estimated and scheduled for a future iteration just like other work, so the bug-list and technical debt is never allowed to build up, and where possible bug fixing is not allowed to derail our iteration plan.)
We don't treat work in progress (items in an iteration) as Product Backlog - the product backlog is work that we plan to do in the future, and when it moves into an iteration it becomes actively worked on and no longer in the "to do" list (it's the Iteration backlog, not the Product backlog). When all of the work (task/bug) is complete, then the parent story can be Resolved ('we think it is "done"') and then Closed ('the Product Owner accepts it as "done"') and so a simple query (work under Project\Current that is Closed) will tell you what you have delivered this iteration.
Lastly when we close out the iteration, the whole iteration moves into Project\Completed, so then you can easily query all of the work which has ever been completed (under Project\Completed), and still grouped within their individual iterations. So at any time if you want to know what "Build 107" added, you can just do a query for all Closed stories under iteration path Project\Completed\107. We mark incomplete/abandoned work as Removed, so for us Closed means "Done". If work is not completed in one iteration and is continued in the next, then we simply move the story to the next iteration, and so the completed work then shows up in any queries for "Build 108" instead - so this perfectly tracks the achieved deliveries for an iteration.
To keep things consistent, only a few team members can change different types of item. So our "planning items" (Epics, Features, Stories) are only changed by the Project Manager or Product Owners. Tasks are all owned and thus created/changed/closed by the developer that is doing the work. PMs track progress of stories and devs track progress of tasks.
1) Product backlog - Under the 'configure schedule and iterations'
what is the concept for setting the current 'backlog iteration'? Our
team uses short 2 week iterations with a build number, but setting the
build iteration as the current backlog makes all new PBI's scoped to
only that iteration. Any items not complete in that iteration would
disappear once I set the current build to the next iteration number.
On the other hand, if I set it to the parent root node, I could see
the PBI list getting rather large over time. What is the best method
for managing PBI's that are unassigned and working in a simple
Parent->build1/build2 etc structure?
TFS has two different backlogs: the Product Backlog of your team and the Sprint Backlog of your team. In the iteration configuration screen you define which iteration contains your team's product backlog (by setting the Backlog iteration) and which iterations below that will represent your sprints.
If you have a large list of PBIs, you can either put them in an iteration above the current backlog iteration, which will effectively hide them from the backlog pages, or place them in a separate iteration that is a sibling of your Backlog iteration.
2) Features - So I create a feature, perhaps it spans many work items
and several tasks. They get completed over time, but I've noticed
there's no 'auto' complete or status updates on parent items. So
who/when is a Feature item supposed to get marked complete? If the
product owner is supposed to use the features list to get an overview
of work, they have no idea if all the dependent items have been
complete and when to mark the feature Done.
There is no auto-complete or auto-close. Normally the Product Owner (Scrum role) will keep an eye on what he has requested and knows roughly when a feature is about to be completed.
You can also view the hierarchy from Features down to Product Backlog Items in the Product Backlog view by selecting the Features to Backlog Items view. This will also list the states of the underlying stories.
3) Work Items - Managing these, and in particular their 'state' or
status seems like a royal pain. On the task board you can't change
their state, only their tasks with drag-drop, which is nice. But you
complete all the tasks, and the parent work item stays in status
'New'. Do you really have to micro-manage every work item, open it up,
and set the state to Done?
Normally the product owner/project manager will approve stories for pickup and move them from New to Approved. Then during the sprint planning meeting (or at the start of a sprint), the team selects which items they will work on and moves these from Approved to Committed.
Then at the end of the sprint (or when all tasks under a story are done), the development team shows the product owner the finished work and moves the story to Done as well.
4) QA/testing - For every work item, each team member is responsible
for testing each item, so every item is tested by multiple people, and
logging any issues found. What's the best way to use work items or
tasks for this?
That depends on the maturity of the team, and on your adoption of Test Manager (the Test Case work item). If your team is fairly mature and is using Test Manager to link Test Cases to your stories, then you can view the status of your tests in Web Access. If the team consistently works in an ATDD way, they'll do the work needed to make a test succeed before moving on to the next piece of work. In such a workflow it's not really necessary to create "design-build-test" work items; the work item would probably be akin to "Make test X pass" and would include all the work to create the test, build the code, and make the test pass.
5) Build Complete - Once every work item in the iteration is marked
Done then I assume they are removed from the product backlog correct?
The exception to this seems to be the features they were tied to, the
feature item itself remains open. How do stakeholders view a list of
features that were completed in the current build?
Again, use the Features to Backlog Items view to see which features have had all their work finished. The stakeholders verify that this was indeed all they wanted and that they have no additional requests, work, or feedback needed to truly complete the feature. If this is the case, they close the feature by moving it to Done.