Event sourcing the whole system is bad - microservices

I'm learning proper microservice architecture using CQRS, MassTransit, and different types of storage for the read side. One thing that often comes along with CQRS is event sourcing. I do understand it's not mandatory at all. However, I can't think of why using it on the whole system is really an anti-pattern.
Having a store for all events as a single source of truth can help you build/rebuild a read store on the fly whenever you want.
You are not locked in to any vendor (except for the event store).
For me, the question is more: is it easier to not start with event sourcing (while still having separate data storage per microservice, e.g. Elasticsearch, MongoDB, etc.) and migrate/provision whenever it's needed, or to start by event sourcing everything so that you don't have to deal with migration later on?

I can't think of why using it on the whole system is really an anti pattern.
I agree -- calling it an "anti-pattern" is an overstatement.
The right spelling, I believe, is: using event sourcing on the whole system isn't cost-effective today.
It could be tomorrow, as we get more practice with it, as the cost of designing these systems goes down, and as we learn to extract more benefit from them.
In the meantime - how valuable are the temporal queries that you get from event sourcing? In your core domain, where you get competitive advantage, they could be quite valuable. In places where you are just doing bookkeeping of information provided to you by the outside world? Not so much - you may be getting everything you need out of simpler solutions that only keep track of "now".
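To make "temporal queries" concrete: because the log keeps every change, you can fold the events recorded up to any instant and answer what the state looked like back then. A minimal sketch in plain Java; the event type and field names are invented for illustration and not tied to any particular store:

```java
import java.time.Instant;
import java.util.List;

// Hypothetical event type; all names are invented for illustration.
record AmountDeposited(Instant occurredAt, long cents) {}

class TemporalQuery {
    // Rebuild state "as of" a given instant by folding only the events
    // recorded up to that point -- a query a current-state-only store
    // simply cannot answer.
    static long balanceAsOf(List<AmountDeposited> stream, Instant asOf) {
        return stream.stream()
                .filter(e -> !e.occurredAt().isAfter(asOf))
                .mapToLong(AmountDeposited::cents)
                .sum();
    }
}
```

The same fold, run over the full stream, is what rebuilds a read store from scratch.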

I recently published a blog post about this issue. It explains why event sourcing is a persistence strategy and shouldn't be used at global scale.
To summarize: event sourcing forces you to emit an event for every piece of changed data. This can result in very fine-grained events. If you use event sourcing for inter-microservice communication, you expose those events to the outside world.
In the end, you expose your persistence layer, comparable to exposing your (relational) database schema in a CRUD-based persistence strategy.
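To illustrate the point under an invented domain: the fine-grained events below belong to the aggregate's private history, while the coarser event is the kind of contract you'd publish to other services instead of leaking the persistence layer. A hedged sketch; all names are hypothetical:

```java
// Fine-grained events persisted internally by the event-sourced aggregate.
record CustomerEmailChanged(String customerId, String newEmail) {}
record CustomerPhoneChanged(String customerId, String newPhone) {}

// A coarser, intention-revealing integration event published to other
// microservices -- the analogue of exposing a view rather than your
// relational schema.
record CustomerContactDetailsUpdated(String customerId, String email, String phone) {}
```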


Eventual consistency - Axon conflict resolver

I'm working on a PoC to evaluate the use of Axon framework for the development of a new application.
My concern is about the eventual consistency with the CQRS pattern since consistency is a requirement for us.
There are a lot of articles and threads about this topic, so I apologize if I'm creating a duplicate thread.
Axon offers a conflict resolver, but I'm not sure I understand how it works.
I found an example in an open source project.
That solution stores the version of the aggregate in the event store and in the read model. The client then reads the version from the read model.
What if I have different read models, could there be version conflicts?
How does Axon solve the conflicts?
Thanks
Before we dive into how Axon deals with consistency, there are a few things that I'd like to point out in the context of CQRS as a concept.
There is a lot of misconception around consistency in combination with CQRS. The concept of eventual consistency applies between the different models that you have defined within your application. For example, a Command Model may have changed state recently, but the Query Model doesn't reflect that state yet. The Query Model is eventually consistent with the Command Model. However, the information within that Query Model is still consistent in itself.
More importantly, this allows you to make conscious choices around where consistency is important and where it can be relaxed. Typically, Command Models make decisions in which consistency is important. You'd want to make sure each decision is made with the relevant knowledge of recent changes. That's the purpose of the Aggregate. An Aggregate will always make decisions that are consistent with its state.
I recommend reading up on the Reactive Principles document [1], namely Section V [2].
Then Axon. Axon implements the concepts of DDD and CQRS very strictly. Consistency is sacred within an Aggregate. For example, when using Event Sourcing, the events within an Aggregate's stream are guaranteed to have been generated based on a state that included all previous events in that stream. In other words, event number 9 in the stream was created with the knowledge of events number 0 through 8. Guaranteed.
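As a rough sketch of what that looks like in code (annotations are from Axon Framework 4's public API; the account domain, commands, and events are invented for illustration):

```java
import org.axonframework.commandhandling.CommandHandler;
import org.axonframework.eventsourcing.EventSourcingHandler;
import org.axonframework.modelling.command.AggregateIdentifier;
import org.axonframework.modelling.command.TargetAggregateIdentifier;
import static org.axonframework.modelling.command.AggregateLifecycle.apply;

// Hypothetical commands and events, for illustration only.
record OpenAccount(@TargetAggregateIdentifier String accountId) {}
record AccountOpened(String accountId) {}
record WithdrawCash(@TargetAggregateIdentifier String accountId, long cents) {}
record CashWithdrawn(String accountId, long cents) {}

public class Account {

    @AggregateIdentifier
    private String accountId;
    private long balanceCents;

    protected Account() {
        // Axon re-creates the aggregate by replaying its event stream
        // through the @EventSourcingHandler methods below.
    }

    @CommandHandler
    public Account(OpenAccount cmd) {
        apply(new AccountOpened(cmd.accountId()));
    }

    @CommandHandler
    public void handle(WithdrawCash cmd) {
        // The decision is made against state rebuilt from ALL earlier events
        // in this aggregate's stream -- the guarantee described above.
        if (cmd.cents() > balanceCents) {
            throw new IllegalStateException("insufficient funds");
        }
        apply(new CashWithdrawn(cmd.accountId(), cmd.cents()));
    }

    @EventSourcingHandler
    public void on(AccountOpened evt) {
        this.accountId = evt.accountId();
        this.balanceCents = 0;
    }

    @EventSourcingHandler
    public void on(CashWithdrawn evt) {
        this.balanceCents -= evt.cents();
    }
}
```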
When events are published, this doesn't mean any projections are already up to date. This may take a few milliseconds. Relaxing consistency here allows us to scale our system. The only downside is that a user may execute a command, perform a query and not see the results yet. This is actually much more common in systems than you think. There are numerous ways to prevent this from being a problem. Updating user interfaces in real-time is a powerful way of working with this. Then it doesn't matter which user made the change; they see it practically immediately.
The other way round may pose a challenge. A user observes the system state through a Query. This may (and always will, even without CQRS) provide stale data; the data may have been altered while the user is looking at it. The user decides to make a change. However, in parallel, the information has already been changed. This other change may be such that, had the user known about it, they would never have submitted that Command.
In Axon, you can use Conflict Resolvers to detect these "unseen" parallel actions. You can take the "aggregate sequence" from incoming events and store it with your projection. If a user action results in a Command towards that aggregate, pass that aggregate sequence as the Expected Aggregate Version. If the actual Aggregate's version doesn't match (because it has been altered in the meantime), you get to decide whether that is problematic. There is a short explanation in the Reference Guide [3].
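A sketch of that flow, using @TargetAggregateVersion and the injected ConflictResolver from Axon 4 as described in the Reference Guide; the product domain and all names are invented:

```java
import org.axonframework.commandhandling.CommandHandler;
import org.axonframework.eventsourcing.conflictresolution.ConflictResolver;
import org.axonframework.modelling.command.TargetAggregateIdentifier;
import org.axonframework.modelling.command.TargetAggregateVersion;
import static org.axonframework.modelling.command.AggregateLifecycle.apply;

// Hypothetical command: carries the aggregate sequence the user saw in the
// read model as the expected version.
record RenameProduct(@TargetAggregateIdentifier String productId,
                     @TargetAggregateVersion Long expectedVersion,
                     String newName) {}

record ProductRenamed(String productId, String newName) {}

public class Product {
    // ... @AggregateIdentifier field and event sourcing handlers elided ...

    @CommandHandler
    public void handle(RenameProduct cmd, ConflictResolver conflictResolver) {
        // Axon injects a ConflictResolver when expectedVersion lags behind the
        // aggregate's actual version. Here we declare that only an unseen
        // rename counts as a real conflict; any other concurrent events are
        // deemed harmless and the command proceeds.
        conflictResolver.detectConflicts(events -> events.stream()
                .anyMatch(e -> ProductRenamed.class.equals(e.getPayloadType())));
        apply(new ProductRenamed(cmd.productId(), cmd.newName()));
    }
}
```

If the predicate matches, a conflict exception is raised and the command fails; otherwise the stale expected version is tolerated.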
I hope this sheds some light on consistency in the context of CQRS and Axon.
[1] https://principles.reactive.foundation
[2] https://principles.reactive.foundation/principles/tailor-consistency.html
[3] https://docs.axoniq.io/reference-guide/axon-framework/axon-framework-commands/modeling/conflict-resolution

Amount of properties per command/event in event sourcing

I'm learning CQRS/event sourcing, and recently I listened to a talk in which the speaker said that you should pass as few parameters to an event as possible; in other words, make events as tiny as possible. The main reason given is that it's impossible to change events later, as that would break the event history, and it's easier to design small events correctly. But what if, for example, the UI requires a form with 10 fields to create a new aggregate, and the same applies when updating the aggregate? What do you do in such a case? And what do you do if the business later decides to change something, but we have a huge event updating 10 fields?
The decision is always context-specific and each case deserves its own review of using thin events vs fat events.
The motivation for using thin domain events is to include just the information required to perform the state transition.
As for fat events: your projections might require a piece of entity state so that the projection itself doesn't have to apply any domain logic (keeping logic out of projections is best practice).
For integration, you'd prefer emitting fat events, because you rarely know who will consume your event. Still, the content of the event should convey information related to the meaning of the event itself.
References:
Putting your events on a diet
Patterns for Decoupling in Distributed Systems: Fat Event
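To make the thin/fat distinction concrete, here is a minimal hypothetical sketch; the order domain and every name in it are invented:

```java
// Thin event: just enough data to perform the state transition.
record OrderLineAdded(String orderId, String sku, int quantity) {}

// Fat event: additionally carries derived state (the running totals), so a
// projection or an external consumer can update itself without re-running
// the domain logic that produced those numbers.
record OrderLineAddedWithTotals(String orderId, String sku, int quantity,
                                long orderTotalCents, int lineCount) {}
```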
recently I listened to a talk in which the speaker said that you should pass as few parameters to an event as possible; in other words, make events as tiny as possible.
I'm not convinced that holds up. If you are looking for good ideas about designing events, you should review Greg Young's e-book on versioning.
If you are event sourcing, then you are primarily concerned with ensuring that your stream of events allows you to recreate the state of your domain model. The events themselves should be representations of changes that a domain expert will recognize. If you find yourself trying to invent smaller events just to fit some artificial constraint like "no more than three properties per event" then you are going to end up with data that doesn't really match the way your domain experts think -- which is to say, technical debt.

Does an append-only event store result in an append-only codebase?

When implementing an application with event sourcing, the persistence engine at work is an event store: an append-only log of events, in past tense, in the order of occurrence. By simply replaying the events through the application, the state at any point in time can be reproduced.
My concern – doesn't this append-only event store inevitably lead to an append-only codebase? How can you maintain a codebase if removing, or even altering, code might leave the application unable to replay the sequence of events? Can the number of source lines of code ever decrease?
What if a business rule has to be modified, or perhaps worse, what if a nasty bug in the early days of the application allowed it to enter a forbidden state? Must the faulty code be kept alive indefinitely? Of course, a lot of these issues can – in theory – be dealt with using event versioning, event schemas, snapshot versioning, etc. But hasn't event sourcing become a burden at that point?
Event sourcing is a fairly new technology, at least in production. I suspect there are few applications that have been running on it for more than a couple of years. What will they look like in 10 years? That's not an unrealistic age for an enterprise application.
My concern – doesn't this append-only event store inevitably lead to an append-only codebase?
No, it implies an append-only schema, which is decoupled from your implementation.
What if a business rule has to be modified, or perhaps worse, what if a nasty bug in the early days of the application allowed it to enter a forbidden state? Must the faulty code be kept alive indefinitely?
Not really - the domain is decoupled from the durable representations.
Yes, there are some common scenarios that you need to incorporate into your design; like the idea that you may need to compensate for errors earlier in the event history.
It's not fundamentally different from what you would do if you were only storing current state. If you have a representation of an aggregate in your database that is in the wrong state, you just update it in place, right? By changing some of the fields to what they are supposed to be.
The idea is the same in event sourcing: you have an event stream that produces a state you don't want to be in. You figure out what additional events are necessary to reach the state you should be in, and append them. Tada.
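A toy sketch of that compensation pattern in plain Java, with invented event names: the buggy deposit stays in the history, and an appended correction fixes the state on replay:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical events. The stream is never edited; errors are compensated.
sealed interface AccountEvent permits Deposited, DepositCorrected {}
record Deposited(long cents) implements AccountEvent {}
record DepositCorrected(long adjustmentCents) implements AccountEvent {}

class Replay {
    static long balance(List<AccountEvent> stream) {
        long balance = 0;
        for (AccountEvent e : stream) {
            if (e instanceof Deposited d) balance += d.cents();
            else if (e instanceof DepositCorrected c) balance += c.adjustmentCents();
        }
        return balance;
    }

    public static void main(String[] args) {
        var stream = new ArrayList<AccountEvent>();
        stream.add(new Deposited(10_00));        // bug: recorded 10.00 ...
        stream.add(new DepositCorrected(-9_00)); // ... should have been 1.00
        System.out.println(balance(stream));     // prints 100 (i.e. 1.00)
    }
}
```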
Of course, a lot of these issues can – in theory – be dealt with using event versioning, event schemas, snapshot versioning, etc. But hasn't event sourcing become a burden at that point?
Not really? Yes, you need to design flexibility into your schema so that you can evolve your model aggressively, but at its core it's no different from storing current state - you can still migrate if you have to.
But you also have other levers to play with.
It does, perhaps, require more upfront design capital - you have to think about things like schema lifetimes, and the fact that your book of record accumulates data from multiple revisions of your model.
That doesn't mean it's a shoe for all feet. Designing good message schema is an investment. If the consumers of that schema (which in this case really means your model, and the subscribers) don't need to evolve independently, then maybe that investment doesn't make sense.
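One common shape that "flexibility in your schema" takes is upcasting: stored events carry a schema revision, and old revisions are upgraded to the current shape on read, so the durable history is never rewritten. A hypothetical sketch, not tied to any framework:

```java
import java.util.Map;

// Stored form of an event: type name, schema revision, and raw payload.
record StoredEvent(String type, int revision, Map<String, String> payload) {}

class Upcaster {
    static StoredEvent upcast(StoredEvent e) {
        // Revision 1 had a single "name" field; revision 2 splits it in two.
        if (e.type().equals("CustomerRegistered") && e.revision() == 1) {
            String[] parts = e.payload().get("name").split(" ", 2);
            return new StoredEvent(e.type(), 2, Map.of(
                    "firstName", parts[0],
                    "lastName", parts.length > 1 ? parts[1] : ""));
        }
        return e; // already at the current revision
    }
}
```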

Backing a user interface with a state machine

I am developing a web application with a somewhat complex user interface. It seems like it might be a good idea to back the UI with a corresponding state machine, defining the transitions possible between various states and the corresponding behavior.
The perceived benefits are that the code for controlling the behavior is structured consistently, and that the state of the UI can be persisted and resumed easily.
Can anyone who has tried this lend any insights into this approach? Are there any pitfalls I need to be aware of?
Off the top of my head, these are a bit obvious, but still, since nobody has replied yet:
I'd advise persisting the state of the application server-side, indexed via a session variable/user ID, for security and flexibility reasons;
interfaces are better modeled by an event-based approach IMHO, but this is somewhat dependent on which layer of the UI you're developing, and also on your language of choice for development. You may be able to store some logic on item triggers and on the items themselves.
By an event-based approach, I mean the technique that some "more visual" oriented environments (Adobe Flex, Oracle Forms, and also HTML, in a somewhat limited fashion) use. In a nutshell, you have triggers (item.on_click, label.on_mouse_over, text_field.on_record_update) which you use to drive the states of the interface.
One very common caveat of this kind of approach (distributed control) is endless loops: you have an item that enables another item, which, when enabled, fires its own triggers and eventually causes the first item to fire the same trigger again. This is quite often not obvious during development, but very common to run into during testing.
Some languages/environments offer some protection against the more obvious cases, but this is something to be on the lookout for.
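A toy illustration of the loop and the usual fix: two linked fields whose change handlers trigger each other, with a re-entrancy guard breaking the cycle. The scenario and names are invented; the same idea applies regardless of UI framework:

```java
// Two widgets whose change handlers update each other; without the guard,
// each update would re-fire the other handler forever.
class LinkedFields {
    private String celsius = "";
    private String fahrenheit = "";
    private boolean updating = false; // re-entrancy guard

    void onCelsiusChanged(String value) {
        if (updating) return;            // ignore the echo of our own update
        updating = true;
        try {
            celsius = value;
            fahrenheit = toFahrenheit(value);
            onFahrenheitChanged(fahrenheit); // simulates the other trigger firing
        } finally {
            updating = false;
        }
    }

    void onFahrenheitChanged(String value) {
        if (updating) return;
        updating = true;
        try {
            fahrenheit = value;
            celsius = toCelsius(value);
            onCelsiusChanged(celsius);
        } finally {
            updating = false;
        }
    }

    private String toFahrenheit(String c) {
        return String.valueOf(Double.parseDouble(c) * 9 / 5 + 32);
    }

    private String toCelsius(String f) {
        return String.valueOf((Double.parseDouble(f) - 32) * 5 / 9);
    }
}
```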

CF Project getting too big, what shall one do?

A simple billing system (on top of ColdBox MVC) is ballooning into a semi-enterprisey inventory + provisioning + issue-tracking + profit-tracking app. The parts seem to be doing their own thing, yet they share many things, including a common pool of Clients and Staff (logins) and other intermingled data & business logic.
How do you keep such a system modular, from a maintenance, testability & re-usability standpoint?
single monolithic app? (i.e. new package for the base app)
ColdBox module? Not sure how to make it 'installable' and what benefits it brings yet.
Java Portlet? no idea, just thinking outside the box
SOA architecture? through webservice API calls?
Any idea and/or experience you'd like to share?
I would recommend you break the app into modular pieces using ColdBox Modules. You could also investigate separating the business logic into a RESTful ColdBox layer and joining the system that way. Again, it all depends on your requirements and needs at the moment.
Modules are designed to break monolithic applications into more manageable parts that can be standalone or coupled together.
Stop thinking about technology (e.g. Java Portals, ColdBox modules, etc.) and focus on architecture. By this I mean imagining how you would explain your system to an observer. Start by drawing a set of boxes on a whiteboard that represent each piece - inventory, clients, issue tracking, etc. - and then use lines to show interactions between those systems. This focuses you on a separation of concerns, that is, grouping like functionality together. To start, don't worry about the UI; instead focus on algorithms and data.
If we're talking about MVC, that step focuses on the model. With that activity complete comes the hard part: modifying code to conform to that diagram (i.e. the model). To really understand what this model should look like, I suggest reading Domain Driven Design by Eric Evans. The goal is arriving at a model whose relationships are manageable via dependency injection. Presumably this leaves you with a set of high-level CFCs - services, if you will - with underlying business entities and persistence management. Their relationships are best managed by some sort of bean container / service locator, of which I believe ColdBox has its own; another example is ColdSpring.
The upshot of this effort is a model that's unit testable, independent of the user interface. If all of this is confusing, I'd suggest taking a look at Working Effectively with Legacy Code for some ideas on how to make this transition.
Once you have this in place, it's now possible to think about a controller (e.g. ColdBox) and linking the model to views through it. However, study whatever controller you pick carefully, and choose it for some capability it brings to the table that your application needs (caching is an example that comes to mind). Your views will likely need to be reimagined as well to interact with this new design, but what you should end up with is a system where the algorithms are divorced from the UI, making the views' job easy.
Realistically, the way you tackle this problem is iteratively. Find one system that can easily be teased out in the fashion I describe, get it under unit tests, validate with people as well, and continue to the next system. While it's a tedious process, I can assure you it's much less work than trying to rewrite everything, which invites disaster unless you have a very good set of automated validation ahead of time.
Update
To reiterate, the tech is not going to solve your problem. Continued iteration toward more cohesive objects will.
Now, as far as coupled data goes: with an ORM you've made a tradeoff, and monolithic systems do have their benefits. Another approach would be giving one stateful entity a reference to another's service object via DI, such that you retrieve the related entity through that service. This would enable you to mock it for the purpose of unit testing, and to replace it with a similar service object and corresponding entity to facilitate reuse in other contexts.
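A minimal sketch of that suggestion (in Java rather than CFML, purely for illustration; the billing domain and names are invented): the entity reaches related behavior only through an injected interface, so tests can substitute a mock and other contexts can supply a different implementation:

```java
// The service interface the stateful entity depends on.
interface BillingService {
    long outstandingBalanceCents(String clientId);
}

// The entity holds a reference to the service, not to the other entity,
// so the dependency can be mocked or swapped per context.
class Client {
    private final String id;
    private final BillingService billing; // injected via DI, not constructed here

    Client(String id, BillingService billing) {
        this.id = id;
        this.billing = billing;
    }

    boolean inGoodStanding() {
        return billing.outstandingBalanceCents(id) == 0;
    }
}
```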
In terms of solving business problems (e.g. accounting), reuse is an emergent property: you write multiple systems that do roughly the same thing and then figure out how to generalize. Rarely, if ever, in my experience do you start out writing something to solve a business problem that becomes a reusable component.
I'd suggest you invest some time in looking at Modules. It will help with partitioning your code into logical features whilst retaining the integration with the Model.
Being ColdBox, there are loads of docs and examples...
http://wiki.coldbox.org/wiki/Modules.cfm
http://experts.adobeconnect.com/p21086674/
You need to get rid of the MVC and replace it with an SOA architecture; that way, the only thing joining the two halves is the service requests.
So on the server side you have the DAO and facade layers, and the client side can be an MVC or whatever architecture you want to use, sitting somewhere else. You can even have an individual client for each distinct business.
Even on the server side you can break the project down into multiple servers: what's common between all businesses, and then what's distinct between each of them.
The problem we're facing here luckily isn't unique.
The issue here seems not to be the code itself, or how to break it apart, but rather to understand that you're now into ERP design and development.
Knowing how best to develop and grow an ERP that manages the details of this organization in a logical manner is the deeper question I think you're trying to get at. The design and architecture of the code then flow from an understanding of the core functional areas you need.
Luckily, we can study existing ERP systems to see how they tackled some of these problems. There are a few good open-source ERPs, and what brought this tip to mind is a full-cycle install of SAP Business One I oversaw (a small-to-mid-size ERP that bypasses the challenges of the big SAP).
What you're looking for is how others have solved the same ERP architecture problem you're facing. At the very least you'll get an idea of the tradeoffs of modularization, where to draw the line between modules, and why.
Typically an ERP system handles everything from the quote, to production (if required), to billing, shipping, and the resulting accounting work all the way through.
ERPS handle two main worlds:
Production of goods
Delivery of service
Some businesses are widget factories, others are service businesses. A full-featured, out-of-the-box ERP will have one continuous chain/lifecycle of an "order" which gets serviced through a number of steps.
If you read through a rough list of the steps an ERP can cover, you'll see the ones that apply to you. Those are probably the modules you have, or should be breaking your app into. Imagine the following steps, where each is a different document, all connected to the previous one in the chain.
Lead Generation --> Sales Opportunities
Sales Opportunities --> Quote/Estimate
Quote Estimate --> Sales Order
Sales Order --> Production Order (Build it, or schedule someone to do the work)
Production order --> Purchase orders (Order required materials or specialists to arrive when needed)
Production Order --> Production Scheduling (What will be built, when, or Who will get this done, when?)
Production Schedule --> Produce! (Do the work)
Produced Service/Good --> Inventory Adjustments - Convert any raw inventory to finished goods if needed, or get it ready to ship
Finished Good/Service --> Packing Slip
Packing Slip items --> Invoice
Where system integrators come in is using the steps required and skipping over the ones that aren't. This suggests one thing for your growing app:
Get a solid data-security strategy in place. Make sure you're comfortable that everyone can only see what they should. Assuming that is in place, it's a good idea to break the app apart into its major sections. Modules are our friends. The order to break them up in, however, will likely have a larger effect on what you do than anything else.
See which sections are general (reporting, etc.) and could be re-used between multiple apps, and which are more specialized to the application itself. The features that are tied to the application itself will likely be more tightly coupled already, and you may have to work around that.
For an ERP, I have always preferred a transactional "core" module that all the other transaction providers plug into (billing pushing the process along once it is defined).
When I converted a Lotus Notes ERP from the '90s to the SAP ERP, the Lotus Notes app was excellent; it handled everything as it should. There were some mini-apps built on the side that weren't integrated as modules, which was the main reason to get rid of it.
If you re-wrote the app today, with today's requirements, how would you have done it differently? See if there are any major differences from what you have. Let the app fight for your attention to decide what needs overhauling / modularization first. ColdBox is wonderful for modularization; whether you're using plugin-type modules or just well-separated code, you won't go wrong with it. It's just a function of the developer time and money available to get it done.
The first modules I'd build / automate unit testing on are the most programmatically complex. Chances are, if you're a decent dev, you don't need end-to-end unit testing as of yesterday. Start with the most complex, move on to the core parts of the app, and then spread into any other areas that may keep you up at night.
Hope that helped! Share what you end up doing if you don't mind; if anything I mentioned needs further explanation, hit me up on here or Twitter :)
#JasPanesar
