I have the components of a trading robot. This follows the architecture of:
Data layer - streaming and formatting data
Model layer - updates model and writes events to event queue
Intelligence layer - fetch event, classify (buy, sell, null), filter, construct order (instrument, buy/sell, stop), write to order queue
Order layer - fetch event, choose size (or reject), place order, write order to DB
My question is:
What is the best design pattern to co-ordinate all components involved in each layer?
For a (simplified) example, I do not feel the following would be good practice:
Model M creates an instance of DataSource D
M creates an instance of Intelligence I
I creates an instance of Order O
The main point of the above being everything instantiates everything else, so nothing is operating independently (thus reduced redundancy).
But I also don't feel like one class which instantiates everything and manages the interactions is good practice.
Can anyone advise?
This is why people use IoC, it solves this problem. https://en.wikipedia.org/wiki/Inversion_of_control
Look at your language/framework stack, and search for an IoC library, it will likely fix most of your issues.
Related
I am trying to understand which of the following two options is the right approach and why.
Say we have GetHotelInfo(hotel_id) API that is being invoked from the Web till the Controller.
The logic of the GetHotelInfo is:
Invoke GetHotelPropertyData() (Location, facilities…)
Invoke GetHotelPrice(hotel_id, dates…)
Invoke GetHotelReviews(hotel_id)
Once all results come back, process and merge the data and return 1 object that contains all relevant data of the hotel.
Option 1:
Create 3 different repositories (HotelPropertyRepo, HotelPriceRepo,
HotelReviewsRepo)
Create GetHotelInfo usecase that will use these 3 repositories and
return the final result.
Option 2:
Create 3 different repositories (HotelPropertyRepo, HotelPriceRepo,
HotelReviewsRepo)
Create 3 different usecases (GetHotelPropertyDataUseCase,
GetHotelPriceUseCase, GetHotelReviewsUseCase)
Create GetHotelInfoUseCase that will orchestrate the previous 3
usecases. (It can also be a controller, but that’s a different topic)
Let’s say that right now only GetHotelInfo is being exposed to the Web but maybe in the future, I will expose some of the inner requests as well.
And would the answer be different if the actual logic of GetHotelInfo is not a combination of 3 endpoints but rather 10?
You can see a similar method (called Get()) in "Clean Architecture with GO" from Manato Kuroda
Manato points out that:
following Acyclic Dependencies Principle (ADP), the dependencies only point inward in the circle, not point outward and no circulation.
that Controller and Presenter are dependent on Use Case Input Port and Output Port which is defined as an interface, not as specific logic (the details). This is possible (without knowing the details in the outer layer) thanks to the Dependency Inversion Principle (DIP).
That is why, in example repository manakuro/golang-clean-architecture, Manato creates for the Use cases layer three directories:
repository,
presenter: in charge of Output Port
interactor: in charge of Input Port, with a set of methods of specific application business rules, depending on repository and presenter interface.
You can use that example, to adapt your case, with GetHotelInfo declared first in hotel_interactor.go file, and depending on specific business method declared in hotel_repository, and responses defined in hotel_presenter
Is expected Interactors (Use Case class) call other interactors. So, both approaches follow Clean Architecture principles.
But, the "maybe in the future" phrase goes against good design and architecture practices.
We can and should think the most abstract way so that we can favor reuse. But always keeping things simple and avoiding unnecessary complexity.
And would the answer be different if the actual logic of GetHotelInfo is not a combination of 3 endpoints but rather 10?
No, it would be the same. However, as you are designing APIs, in case you need the combination of dozens of endpoints, you should start considering put a GraphQL layer instead of adding complexity to the project.
Clean is not a well-defined term. Rather, you should be aiming to minimise the impact of change (adding or removing a service). And by "impact" I mean not only the cost and time factors but also the risk of introducing a regression (breaking a different part of the system that you're not meant to be touching).
To minimise the "impact of change" you would split these into separate services/bounded contexts and allow interaction only through events. The 'controller' would raise an event (on a shared bus) like 'hotel info request', and each separate service (property, price, and reviews) would respond independently and asynchronously (maybe on the same bus), leaving the controller to aggregate the results and return them to the client, which could be done after some period of time. If you code the result aggregator appropriately it would be possible to add new 'features' or remove existing ones completely independently of the others.
To improve on this you would then separate the read and write functionality of each context into its own context, each responding to appropriate events. This will allow you to optimise and scale the write function independently of the read function. We call this CQRS.
I am quite new in context of Micro-service architecture and reading this post : http://microservices.io/patterns/data/event-sourcing.html to get familiar with Event sourcing and data storage in Microservice architecture.
I have read many documents about 3 important aspect of system :
Using event sourcing instead of a simply shared DB and ORM and
row update
Events are JAVA objects.
In case of saving data permanently
, we need to use DB (either relational or noSQL)
Here are my questions :
How database comes along with event sourcing? I have read CQRS
pattern, but I can not understand how CQRS pattern is related to
event store and event objects ?
Can any body provide me a
complete picture and set of operations happens with all players to
gather: CQRS pattern , Event sourcing (including event storage
module) and finally different microservices?
In a system
composed of many microservices, should we have one event storage or
each microservice has its own ? or both possible ?
same
question about CQRS. This pattern is implemented in all
microservices or only in one ?
Finally, in case of using
microservice architecture, it is mandatory to have only one DB or
each Microserivce should have its own ?
As you can see, I have understood all small pieces of game , but I can not relate them together to compose a whole image. Specially relevance between CQRS and event sourcing and storing data in DB.
I read many articles for example :
https://ookami86.github.io/event-sourcing-in-practice/
https://msdn.microsoft.com/en-us/library/jj591577.aspx
But in all of them small players are discussed. Even a hand drawing piece of image will be appreciated.
How database comes along with event sourcing? I have read CQRS pattern, but I can not understand how CQRS pattern is related to event store and event objects ?
"Query" part of CQRS instructs you how to create a projection of events, which is applicable in some "bounded context", where the database could be used as a means to persist that projection. "Command" part allows you to isolate data transformation logic and decouple it from the "query" and "persistence" aspects of your app. To simply put it - you just project your event stream into the database in many ways (projection could be relational as well), depending on the task. In this model "query" and "command" have their own way of projecting and storing events data, optimised for the needs of that specific part of the application. Same data will be stored in events and in projections, this will allow achieving simplicity and loose coupling among subdomains (bounded contexts, microservices).
Can any body provide me a complete picture and set of operations happens with all players to gather: CQRS pattern , Event sourcing (including event storage module) and finally different microservices?
Have you seen Greg Young's attempt to provide simplest possible implementation? If you still confused, consider creating more specific question about his example.
In a system composed of many microservices, should we have one event storage or each microservice has its own ? or both possible ?
It is usually one common event storage, but there definitely could be some exceptions, edge cases where you really will need multiple storages for different microservices here and there. It all depends on the business case. If you not sure - most likely you just need a single storage for now.
same question about CQRS. This pattern is implemented in all microservices or only in one ?
It could be implemented in most performance-demanding microservices. It all depends on how complex your implementation becomes when you are introducing CQRS into it. If it gets simpler - why not implement it everywhere? But if people in your team become more and more confused by the need to perform more explicit synchronisation between commands and queries parts - maybe cqrs is too much for you. It all depends on your team, on your domain ... there is no single simple answer, unfortunately.
Finally, in case of using microservice architecture, it is mandatory to have only one DB or each Microservice should have its own ?
If same microservices sharing same tables - this is usually considered as an antipattern, as it increases coupling, the system becomes more fragile. You can still share the same database, but there should be no shared tables. Also, tables from one microservice better not have FK's to tables in another microservice. Same reason - to reduce coupling.
PS: consider not to ask coarse-grained questions, as it will be harder to get a response from people. Several smaller, more specific questions will have better chance to be answered.
I want to plan a solution that manages enriched data in my architecture.
To be more clear, I have dozens of micro services.
let's say - Country, Building, Floor, Worker.
All running over a separate NoSql data store.
When I get the data from the worker service I want to present also the floor name (the worker is working on), the building name and country name.
Solution1.
Client will query all microservices.
Problem - multiple requests and making the client be aware of the structure.
I know multiple requests shouldn't bother me but I believe that returning a json describing the entity in one single call is better.
Solution 2.
Create an orchestration that retrieves the data from multiple services.
Problem - if the data (entity names, for example) is not stored in the same document in the DB it is very hard to sort and filter by these fields.
Solution 3.
Before saving the entity, e.g. worker, call all the other services and fill the relative data (Building Name, Country name).
Problem - when the building name is changed, it doesn't reflect in the worker service.
solution 4.
(This is the best one I can come up with).
Create a process that subscribes to a broker and receives all entities change.
For each entity it updates all the relavent entities.
When an entity changes, let's say building name changes, it updates all the documents that hold the building name.
Problem:
Each service has to know what can be updated.
When a trailing update happens it shouldnt update the broker again (recursive update), so this can complicate to the microservices.
solution 5.
Keeping everything normalized. Fileter and sort in ElasticSearch.
Problem: keeping normalized data in ES is too expensive performance-wise
One thing I saw Netflix do (which i like) is create intermediary services for stuff like this. So maybe a new intermediary service that can call the other services to gather all the data then create the unified output with the Country, Building, Floor, Worker.
You can even go one step further and try to come up with a scheme for providing as input which resources you want to include in the output.
So I guess this closely matches your solution 2. I notice that you mention for solution 2 that there are concerns with sorting/filtering in the DB's. I think that if you are using NoSQL then it has to be for a reason, and more often then not the reason is for performance. I think if this was done wrong then yeah you will have problems but if all the appropriate fields that are searchable are properly keyed and indexed (as #Roman Susi mentioned in his bullet points 1 and 2) then I don't see this as being a problem. Yeah this service will only be as fast as the culmination of your other services and data stores, so they have to be fast.
Now you keep your individual microservices as they are, keep the client calling one service, and encapsulate the complexity of merging the data into this new service.
This is the video that I saw this in (https://www.youtube.com/watch?v=StCrm572aEs)... its a long video but very informative.
It is hard to advice on the Solution N level, but certain problems can be avoided by the following advices:
Use globally unique identifiers for entities. For example, by assigning key values some kind of URI.
The global ids also simplify updates, because you track what has actually changed, the name or the entity. (entity has one-to-one relation with global URI)
CAP theorem says you can choose only two from CAP. Do you want a CA architecture? Or CP? Or maybe AP? This will strongly affect the way you distribute data.
For "sort and filter" there is MapReduce approach, which can distribute the load of figuring out those things.
Think carefully about the balance of normalization / denormalization. If your services operate on URIs, you can have a service which turns URIs to labels (names, descriptions, etc), but you do not need to keep the redundant information everywhere and update it. Do not do preliminary optimization, but try to keep data normalized as long as possible. This way, worker may not even need the building name but it's global id. And the microservice looks up the metadata from another microservice.
In other words, minimize the number of keys, shared between services, as part of separation of concerns.
Focus on the underlying model, not the JSON to and from. Right modelling of the data in your system(s) gains you more than saving JSON calls.
As for NoSQL, take a look at Riak database: it has adjustable CAP properties, IIRC. Even if you do not use it as such, reading it's documentation may help to come up with suitable architecture for your distributed microservices system. (Of course, this applies if you have essentially parallel system)
First of all, thanks for your question. It is similar to Main Problem Of Document DBs: how to sort collection by field from another collection? I have my own answer for that so i'll try to comment all your solutions:
Solution 1: It is good if client wants to work with Countries/Building/Floors independently. But, it does not solve problem you mentioned in Solution 2 - sorting 10k workers by building gonna be slow
Solution 2: Similar to Solution 1 if all client wants is a list enriched workers without knowing how to combine it from multiple pieces
Solution 3: As you said, unacceptable because of inconsistent data.
Solution 4: Gonna be working, most of the time. But:
Huge data duplication. If you have 20 entities, you are going to have x20 data.
Large complexity. 20 entities -> 20 different procedures to update related data
High cohesion. All your services must know each other. Data model change will propagate to every service because of update procedures
Questionable eventual consistency. It can be done so data will be consistent after failures but it is not going to be easy
Solution 5: Kind of answer :-)
But - you do not want everything. Keep separated services that serve separated entities and build other services on top of them.
If client wants enriched data - build service that returns enriched data, as in Solution 2.
If client wants to display list of enriched data with filtering and sorting - build a service that provides enriched data with filtering and sorting capability! Likely, implementation of such service will contain ES instance that contains cached and indexed data from lower-level services. Point here is that ES does not have to contain everything or be shared between every service - it is up to you to decide better balance between performance and infrastructure resources.
This is a case where Linked Data can help you.
Basically the Floor attribute for the worker would be an URI (a link) to the floor itself. And Any other linked data should be expressed as URIs as well.
Modeled with some JSON-LD it would look like this:
worker = {
'#id': '/workers/87373',
name: 'John',
floor: {
'#id': '/floors/123'
}
}
floor = {
'#id': '/floor/123',
'level': 12,
building: { '#id': '/buildings/87' }
}
building = {
'#id': '/buildings/87',
name: 'John's home',
city: { '#id': '/cities/908' }
}
This way all the client has to do is append the BASE URL (like api.example.com) to the #id and make a simple GET call.
To remove the extra calls burden from the client (in case it's a slow mobile device), we use the gateway pattern with micro-services. The gateway can expand those links with very little effort and augment the return object. It can also do multiple calls in parallel.
So the gateway will make a GET /floor/123 call and replace the floor object on the worker with the reply.
1) What are the BLL-services? What's the difference between them and Service Layer services? What goes to domain services and what goes to service layer?
2) Howcome I refactor BBL model to give it a behavior: Post entity holds a collection of feedbacks which already makes it possible to add another Feedback thru feedbacks.Add(feedback). Obviosly there are no calculations in a plain blog application. Should I define a method to add a Feedback inside Post entity? Or should that behavior be mantained by a corresponing service?
3) Should I use Unit-Of-Work (and UnitOfWork-Repositories) pattern like it's described in http://www.amazon.com/Professional-ASP-NET-Design-Patterns-Millett/dp/0470292784 or it would be enough to use NHibernate ISession?
1) Business Layer and Service Layer are actually synonyms. The 'official' DDD term is an Application Layer.
The role of an Application Layer is to coordinate work between Domain Services and the Domain Model. This could mean for example that an Application function first loads an entity trough a Repository and then calls a method on the entity that will do the actual work.
2) Sometimes when your application is mostly data-driven, building a full featured Domain Model can seem like overkill. However, in my opinion, when you get used to a Domain Model it's the only way you want to go.
In the Post and Feedback case, you want an AddFeedback(Feedback) method from the beginning because it leads to less coupling (you don't have to know if the FeedBack items are stored in a List or in a Hashtable for example) and it will offer you a nice extension point. What if you ever want to add a check that no more then 10 Feedback items are allowed. If you have an AddFeedback method, you can easily add the check in one single point.
3) The UnitOfWork and Repository pattern are a fundamental part of DDD. I'm no NHibernate expert but it's always a good idea to hide infrastructure specific details behind an interface. This will reduce coupling and improves testability.
I suggest you first read the DDD book or its short version to get a basic comprehension of the building blocks of DDD. There's no such thing as a BLL-Service or a Service layer Service. In DDD you've got
the Domain layer (the heart of your software where the domain objects reside)
the Application layer (orchestrates your application)
the Infrastructure layer (for persistence, message sending...)
the Presentation layer.
There can be Services in all these layers. A Service is just there to provide behaviour to a number of other objects, it has no state. For instance, a Domain layer Service is where you'd put cohesive business behaviour that does not belong in any particular domain entity and/or is required by many other objects. The inputs and ouputs of the operations it provides would typically be domain objects.
Anyway, whenever an operation seems to fit perfectly into an entity from a domain perspective (such as adding feedback to a post, which translates into Post.AddFeedback() or Post.Feedbacks.Add()), I always go for that rather than adding a Service that would only scatter the behaviour in different places and gradually lead to an anemic domain model. There can be exceptions, like when adding feedback to a post requires making connections between many different objects, but that is obviously not the case here.
You don't need a unit-of-work pattern on top on the NHibernate session:
Why would I use the Unit of Work pattern on top of an NHibernate session?
Using Unit of Work design pattern / NHibernate Sessions in an MVVM WPF
I have an application with about 20 models and controllers and am not using any particular framework. What is the best practice for using multiple remote objects in Flex performance-wise?
1) Method 1 - One per Component - Each component instantiates a RemoteObject for itself
2) Method 2 - Multiple in Application Root - Each controller is handled by a RemoteObject in the root
3) Method 3 - One in Application Root - Combine all controllers into one class and handle them with one RemoteObject
I'm guessing 3 will have the best performance but will be too messy to maintain and 1 would be the cleanest but would take a performance hit. What do you think?
Best practice would be "none of the above." Your Views should dispatch events that a controller or Command component would use to call your service(s) and then update your model on return of the data. Your Views would be bound to the data, and then the Views would automatically be updated with the new data.
My preference is to have one service Class per different piece or type of data I am retrieving--this makes it easier to build mock services that can be swapped for real services as needed depending on what you're doing (for instance if you have a complicated server setup, a developer who is working on skinning would use the mocks). But really, how you do that is a matter of personal preference.
So, where do your services live, so that a controller or command can reach them? If you use a Dependency Injection framework such as Robotlegs or Swiz, it will have a separate object that handles instantiating, storing, and and returning instances of model and service objects (in the case of Robotlegs, it also will create your Command objects for you and can create view management objects called Mediators). If you don't use one of these frameworks, you'll need to "roll your own," which can be a bit difficult if you're not architecturally minded.
One thing people who don't know how to roll their own (such as the people who wrote the older versions of Cairngorm) tend to fall back on is Singletons. These are not considered good practice in this day and age, especially if you are at all interested in unit testing your work. http://misko.hevery.com/code-reviewers-guide/flaw-brittle-global-state-singletons/
A lot depends on how much data you have, how many times it gets refreshed from the server, and of you have to support update as well as query.
Number 3 (and 2) are basically a singletons - which tends to work best for large applications and large datasets. Yes, it would be complex to maintain yourself, but that's why people tend to use frameworks (puremvc, cairgorm, etc). much of the complexity is handled for you. Caching data within the frameworks also enhances performance and response time.
The problem with 1 is if you have to coordinate data updates per component, you basically need to write a stateless UI, always retrieving the data from the server on each component visibility.
edit: I'm using cairgorm - have ~ 30 domain models (200 or so remote calls) and also use view models. some of my models (remote object) have 10's of thousands of object instances (records), I keep a cache with/write back. All of the complexity is encapsulated in the controller/commands. Performance is acceptable.
In terms of pure performance, all three of those should perform roughly the same. You'll of course use slightly more memory by having more instances of RemoteObject and there are a couple of extra bytes that get sent along with the first request that you've made with a given RemoteObject instance to your server (part of the AMF protocol). However, the effect of these things is negligible. As such, Amy is right that you should make a choice based on ease of maintainability and not performance.