I am building a microservice that will have a management layer, plus a corresponding microservice for a public API. The management layer's client will be imported into the API service as well as a few other apps. What I am wondering is the proper way to structure the management layer's calls. I'm using RPC.
For example, let's say the service contains Human, Dog, Food. Food has to be owned by a Human OR Dog, not both and not neither.
Now, since the management service will only be accessed through the client and not by URL, would it be better to define the specs like:
POST /humans/{id}/food. Client call: mgmtService.createFoodForHuman(humanId, Food)
POST /dogs/{id}/food. Client call: mgmtService.createFoodForDog(dogId, Food)
or
POST /food. Client call: mgmtService.createFood(Food)
where, in the second option, the Food payload is required to carry exactly one of a human id or a dog id.
To me, the first example seems more straightforward from a code-client perspective, since the methods are more clearly defined and the second example gives no hint about which kind of id is needed, but I am interested in any thoughts on which one is best practice. Thanks!
It actually depends; at larger scale you would want to apply the first method. Sometimes, when you work with other teams to build a complex service, another team may have to fix your bug in your absence, or in some other case where you can't fix it at that moment. If the bug is really urgent, you wouldn't want it to become a blocker because they did not understand how your code works.
Well, you can always provide documentation for every API detail on a platform such as Confluence, but it would still take time to look it up.
On the other hand, with the second method you have one API call that covers everything.
TL;DR -> Large scale 1st method, small scale 2nd method.
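For illustration, here is roughly what the two client surfaces could look like (sketched in Java; all type and method names are hypothetical, not from the original service):

    // Option 1: one explicit method per owner type. The method name documents
    // which kind of id is expected, so call sites are self-describing.
    interface MgmtServiceExplicit {
        Food createFoodForHuman(long humanId, Food food);
        Food createFoodForDog(long dogId, Food food);
    }

    // Option 2: a single generic method. The "human or dog, not both and not
    // neither" rule has to live in the request object and be validated server-side.
    interface MgmtServiceGeneric {
        Food createFood(CreateFoodRequest request);
    }

    // Minimal request type for option 2; exactly one owner id must be non-null.
    record CreateFoodRequest(Long humanId, Long dogId, Food food) {}

    record Food(String name) {}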
Related
My app (Laravel/Vue) has some API endpoints like /api/price/{id} and /api/buy/{id}. The first endpoint gives the price of a product and the second buys x quantity (passed by POST) of that product.
Prices are highly volatile (think things like cryptocurrencies), so the /price/ endpoint does some calculations and returns the realtime price. When I click the buy button in the browser it goes to the /buy/ endpoint, of course. The first thing it has to do is calculate the realtime price, just as the /price/ endpoint does, then write to the database, etc.
For reasons beyond my control the endpoints are in different controllers: I have a PriceController and a BuyController. So I have several options to make this work:
I write the same code in both endpoints. Of course, I don't like this solution one bit, because of maintenance.
(Which I'm currently using) Inside my BuyController I make an API call to PriceController. Sure, it works, but I find it quite strange to go out to the network to access something local, and I don't know if it will have any performance impact under load (in development it doesn't seem to).
Should I write a helper function that does all the needed calculations and use it as the price source for both the /price/ and /buy/ endpoints? I feel this approach might be better. I normally don't use helpers, so I can't even begin to imagine what to name the function.
What should be the "correct" approach?
Maybe the correct way is to unify the /price/ and /buy/ endpoints and have a shared function inside the resulting controller, but I don't know how many changes that would require in the rest of the app.
You should probably write a Service and use it in both controllers (if you move your controllers' logic into services, even better). But if you are using the calculation logic frequently, then maybe a helper function would be more suitable.
If you are not comfortable using services, you can also use this approach.
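For illustration, a minimal sketch of the service approach; the original stack is Laravel/PHP, but the shape is the same in any language, so it is shown here in Java with all names made up:

    // Hypothetical shared service: the single source of truth for the realtime
    // price calculation, injected into both controllers.
    class PriceCalculator {
        double realtimePrice(long productId) {
            // The volatile-market calculation lives in exactly one place.
            return fetchBasePrice(productId) * currentMarketFactor();
        }
        private double fetchBasePrice(long productId) { return 100.0; } // stub
        private double currentMarketFactor() { return 1.0; }            // stub
    }

    // Both controllers depend on the service instead of duplicating the logic
    // or making an HTTP round-trip to a sibling endpoint.
    class PriceController {
        private final PriceCalculator prices;
        PriceController(PriceCalculator prices) { this.prices = prices; }
        double show(long id) { return prices.realtimePrice(id); }
    }

    class BuyController {
        private final PriceCalculator prices;
        BuyController(PriceCalculator prices) { this.prices = prices; }
        void buy(long id, int quantity) {
            double price = prices.realtimePrice(id); // same calculation as /price/
            // ... write the order to the database using price ...
        }
    }

In Laravel terms this is just a class bound in the service container and injected into both controllers.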
I'm providing a RESTful API to my (JS) client from a (Java Spring) server.
The main site page contains a number of logical blocks (news, latest comments, some trending stuff), and each of them has a corresponding entity on the server. Which is the right way to go: handle one request like
/api/main_page/ ->
{
news: {...}
comments: {...}
...
}
or let the client make a few requests like
/api/news/
/api/comments/
...
I know that in general it's better to have one large request/response, but is that the answer in this situation as well?
Ideally, you should have different API calls for fetching the individual configurable content blocks of the page from the same API.
This way your content blocks are loosely coupled to each other.
You can extend, port (to a new framework), and modify them independently at any time you want.
This becomes extremely useful as the application grows.
Switching off a feature is fairly easy in this case.
A/B testing is also easy in this case.
Writing automation is also very easy.
Overall it helps in reducing the testing effort.
But if you really want to fetch all of this in one call, then you should add an additional param to the request; when the server sees that param, it adds the additional independent JSON to the response by calling its own method from the BL layer.
And if speed is your concern, then try caching these calls on the server for some time (how long depends on the type of application).
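As a rough illustration of that "additional param" idea, here is a minimal Spring-style sketch; the endpoint, parameter, and service names are all hypothetical:

    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    // Hypothetical BL-layer services; each content block keeps its own method.
    interface NewsService     { List<String> latest(); }
    interface CommentsService { List<String> latest(); }

    @RestController
    public class MainPageController {

        private final NewsService news;
        private final CommentsService comments;

        public MainPageController(NewsService news, CommentsService comments) {
            this.news = news;
            this.comments = comments;
        }

        // GET /api/main_page?include=news,comments
        // Each requested block is added by calling the corresponding BL method,
        // so the blocks stay independent even when returned in one response.
        @GetMapping("/api/main_page")
        public Map<String, Object> mainPage(
                @RequestParam(defaultValue = "news,comments") List<String> include) {
            Map<String, Object> page = new LinkedHashMap<>();
            if (include.contains("news"))     page.put("news", news.latest());
            if (include.contains("comments")) page.put("comments", comments.latest());
            return page;
        }
    }

A client that wants one aggregated call asks for /api/main_page?include=news,comments; one that prefers separate blocks requests one block at a time.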
I think that in general multiple requests can be justified when the requested resources reflect parts of the system state (my personal rule of thumb, still a work in progress).
That is, if a news item gets displayed a lot in your client application, I would request it once and reuse it wherever I can. If you aggregate here, you would need to request it again later, some items may never actually get displayed, and you have some magic to do if the representation of a news item differs between the aggregation and the /news/{id} resource.
This approach increases communication when the page is loaded for the first time, but decreases communication throughout your client application the longer it runs.
The state on the server gets copied to your client request by request, and updated when needed (ETags, Last-Modified, etc.).
In your example it looks like /news and /comments are some sort of "latest" or "since last visit" listing, not everything.
If this is true, I would design them as resources as well, like /comments/latest or similar.
But in any case I would have them contain only self-links to /news/{id} or /comments/{id} respectively. Then a request to /comments/latest results in a list of self-links, for which I would start a request only if I don't already have that item (or maybe to check whether the cached copy is still up to date).
It is also possible to trigger the request to /news/{id} only when the item actually gets displayed (scrolling, swiping).
The lifespan of a news item or a comment is probably a criterion for answering this question: client-side caching of them is not that vital to the system, as opposed to, say, a book in a book-store app.
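As a sketch of the self-link idea from this answer (all names hypothetical), the /comments/latest resource would return only links, leaving the client to fetch, or skip, each individual item:

    import java.util.List;
    import java.util.Map;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    class LatestCommentsController {

        // GET /comments/latest returns only self-links; the client then fetches
        // /comments/{id} for the items it does not already have cached.
        @GetMapping("/comments/latest")
        List<Map<String, String>> latest() {
            List<Long> ids = List.of(17L, 16L, 15L); // hypothetical: ids from a repository
            return ids.stream()
                      .map(id -> Map.of("self", "/comments/" + id))
                      .toList();
        }
    }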
I'll illustrate my question with Twitter. Twitter has a microservice-based architecture, which means that different processes run on different servers and have different databases.
A new tweet appears: server A stores some data in its own database, generates new events, and fires them. Servers B and C haven't received these events at this point, haven't stored anything in their databases, and haven't processed anything.
Now the user who created the tweet wants to edit it. For that to work, all three services A, B, and C should have processed all events and stored all required data, but services B and C aren't consistent yet. That means we cannot provide edit functionality at the moment.
As I see it, a possible workaround is switching to immediate consistency, but that would take away all the benefits of a microservice-based architecture and could cause problems with tight coupling.
Another workaround is to restrict the user's actions for some time until the data is consistent across all necessary services. That may be a solution, depending on the customer and their business requirements.
And another workaround is to add extra logic, or perhaps a service D, that stores edits as user actions and applies them to the data only once it is consistent. The drawback is greatly increased system complexity.
And there are two-phase commits, but they are 1) not really reliable and 2) slow.
I think slowness is a huge drawback under loads like Twitter's. That could probably be solved, whereas the lack of reliability cannot, again, without increasing the complexity of the solution.
So, the questions are:
Are there any nice solutions to the situation illustrated, other than the workarounds I mentioned? Maybe particular programming platforms or databases?
Have I misunderstood something, and are some of the workarounds incorrect?
Is there any approach other than Eventual Consistency that guarantees all data will be stored and all necessary actions executed by the other services?
Why has Eventual Consistency been picked for this use case? As far as I can see, right now it is the only way to guarantee that some data will be stored or some action performed in an event-driven approach, where services start their work when an event is fired; following my example, that event would be "tweet is created". So, if services B and C go down, I need to be able to perform the action successfully once they are up again.
The things I would like to achieve are reliability, the ability to bear high loads, and adequate solution complexity. Any links on related subjects will be very much appreciated.
If there are natural limitations of this approach and what I want cannot be achieved within this paradigm, that is okay too. I just need to know whether this problem really hasn't been solved yet.
It is all about tradeoffs. With eventual consistency, in your example it may mean that the user cannot edit for maybe a few seconds, since most eventually consistent technologies do not take long to replicate data across nodes. In this use case that is absolutely acceptable, since users are pretty slow in their actions.
For example, from the official MongoDB FAQ:
"MongoDB is consistent by default: reads and writes are issued to the primary member of a replica set. Applications can optionally read from secondary replicas, where data is eventually consistent by default."
Another increasingly popular alternative is to use a streaming platform such as Apache Kafka, where it is up to your architecture design how fast the stream consumer processes the data (for eventual consistency). Since the streaming platform itself is very fast, it mostly comes down to the speed of your stream processor to make the data available in the right place. So we are talking about milliseconds, not seconds, in most cases.
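For reference, a minimal Kafka consumer loop in Java; the topic, group id, and processing step are made up for illustration. The replication delay discussed above is essentially the time between the producer's send and this poll seeing the record:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class TweetEventConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "service-b"); // hypothetical consumer group
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("tweet-created")); // hypothetical topic
                while (true) {
                    ConsumerRecords<String, String> records =
                            consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Update this service's own store; only after this point
                        // is the new tweet visible here.
                        System.out.printf("tweet %s: %s%n", record.key(), record.value());
                    }
                }
            }
        }
    }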
The key thing in these sorts of architectures is to make each service autonomous when it comes to writes: it can accept the write even if none of the other application-level services are up.
So in the example of a Twitter-like service, you would model it as:
Service A manages the content of a post
So when a user makes a post, a write happens in Service A's DB, and from that instant the post can be edited, because editing is just another request to A.
If some other service consumes the "post content" change events from A and exposes some functionality after a "new post" event, that functionality isn't going to be available until that service sees the event (yay, tautologies). But that's just physics: the sun could have gone supernova five minutes ago and we can't take any action (not that we could have) until we "see the light".
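A minimal sketch of that write path (service, topic, and method names are hypothetical): the local write commits first, so editing never waits on B or C, and the event merely lets the other services catch up:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Hypothetical Service A: owns post content and is the only service a
    // create/edit request has to touch.
    class PostService {
        private final KafkaProducer<String, String> producer;

        PostService() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            this.producer = new KafkaProducer<>(props);
        }

        void editPost(String postId, String newContent) {
            saveToOwnDatabase(postId, newContent); // local write: succeeds even if B/C are down
            // Notify the rest of the system; downstream services catch up eventually.
            producer.send(new ProducerRecord<>("post-content-changed", postId, newContent));
        }

        private void saveToOwnDatabase(String postId, String content) {
            // hypothetical persistence
        }
    }

In practice the local write and the publish are usually tied together with an outbox table, so the event is not lost if the process dies between the two steps.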
I want to plan a solution that manages enriched data in my architecture.
To be clearer, I have dozens of microservices.
Let's say: Country, Building, Floor, Worker.
Each runs on its own NoSQL data store.
When I get data from the Worker service, I also want to present the floor name (where the worker works), the building name, and the country name.
Solution 1.
The client queries all the microservices.
Problem: multiple requests, and the client has to be aware of the structure.
I know multiple requests shouldn't bother me, but I believe that returning a JSON describing the entity in one single call is better.
Solution 2.
Create an orchestration service that retrieves the data from multiple services.
Problem: if the data (entity names, for example) is not stored in the same document in the DB, it is very hard to sort and filter by these fields.
Solution 3.
Before saving an entity, e.g. a worker, call all the other services and fill in the related data (building name, country name).
Problem: when the building name changes, the change is not reflected in the Worker service.
Solution 4.
(This is the best one I can come up with.)
Create a process that subscribes to a broker and receives all entity changes.
For each change it updates all the relevant entities: when an entity changes, let's say a building name, it updates all the documents that hold that building name.
Problems:
Each service has to know what can be updated.
A trailing update shouldn't hit the broker again (causing a recursive update), which can complicate the microservices.
Solution 5.
Keep everything normalized; filter and sort in Elasticsearch.
Problem: keeping normalized data in ES is too expensive performance-wise.
One thing I saw Netflix do (which I like) is create intermediary services for things like this. So maybe a new intermediary service can call the other services to gather all the data and then create the unified output with the Country, Building, Floor, and Worker.
You can even go one step further and come up with a scheme for specifying as input which resources you want included in the output.
So I guess this closely matches your Solution 2. I notice that you mention concerns with sorting/filtering in the DBs for Solution 2. If you are using NoSQL, it has to be for a reason, and more often than not the reason is performance. If this was done wrong then, yes, you will have problems, but if all the searchable fields are properly keyed and indexed (as @Roman Susi mentioned in his points 1 and 2), I don't see this being a problem. This service will only be as fast as the culmination of your other services and data stores, so they have to be fast.
You then keep your individual microservices as they are, keep the client calling one service, and encapsulate the complexity of merging the data inside this new service.
This is the video I saw this in (https://www.youtube.com/watch?v=StCrm572aEs); it's a long video but very informative.
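A bare-bones sketch of such an intermediary (every URL and name here is made up): it fans out to the underlying services and merges the replies into one document:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Hypothetical aggregator: calls each microservice and returns one unified
    // document, so the client only ever talks to this service.
    public class WorkerAggregator {
        private final HttpClient http = HttpClient.newHttpClient();

        public Map<String, String> enrichedWorker(String workerId) throws Exception {
            Map<String, String> out = new LinkedHashMap<>();
            // In a real service the floor/building/country ids would be parsed
            // out of the worker document; they are hardcoded to keep this short.
            out.put("worker",   fetch("http://worker-svc/workers/" + workerId));
            out.put("floor",    fetch("http://floor-svc/floors/123"));
            out.put("building", fetch("http://building-svc/buildings/87"));
            out.put("country",  fetch("http://country-svc/countries/1"));
            return out;
        }

        private String fetch(String url) throws Exception {
            HttpRequest req = HttpRequest.newBuilder(URI.create(url)).build();
            return http.send(req, HttpResponse.BodyHandlers.ofString()).body();
        }
    }

The sequential calls could be replaced with HttpClient.sendAsync to fetch the pieces in parallel.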
It is hard to advise at the level of "Solution N", but certain problems can be avoided by following this advice:
Use globally unique identifiers for entities, for example by assigning some kind of URI as the key value.
Global ids also simplify updates, because you can track what has actually changed: the name or the entity (an entity has a one-to-one relation with its global URI).
The CAP theorem says you can choose only two of C, A, and P. Do you want a CA architecture? CP? Or maybe AP? This will strongly affect the way you distribute data.
For "sort and filter" there is MapReduce approach, which can distribute the load of figuring out those things.
Think carefully about the balance of normalization vs. denormalization. If your services operate on URIs, you can have a service which turns URIs into labels (names, descriptions, etc.), so you do not need to keep redundant information everywhere and update it. Do not optimize prematurely; try to keep data normalized as long as possible. This way a worker may not even carry the building name, just its global id, and the microservice looks up the metadata from another microservice.
In other words, minimize the number of keys shared between services, as part of separation of concerns.
Focus on the underlying model, not on the JSON going back and forth. Modelling the data in your system(s) correctly gains you more than saving JSON calls.
As for NoSQL, take a look at the Riak database: it has adjustable CAP properties, IIRC. Even if you do not use it as such, reading its documentation may help you come up with a suitable architecture for your distributed microservices system. (Of course, this applies if your system is essentially parallel.)
First of all, thanks for your question. It is similar to the Main Problem Of Document DBs: how do you sort a collection by a field from another collection? I have my own answer for that, so I'll comment on all your solutions:
Solution 1: Good if the client wants to work with Countries/Buildings/Floors independently. But it does not solve the problem you mention in Solution 2: sorting 10k workers by building is going to be slow.
Solution 2: Similar to Solution 1 if all the client wants is a list of enriched workers without knowing how to combine it from multiple pieces.
Solution 3: As you said, unacceptable because of inconsistent data.
Solution 4: It is going to work, most of the time. But:
Huge data duplication: if you have 20 entities, you are going to hold the data up to 20 times.
Large complexity: 20 entities means 20 different procedures to update related data.
High coupling: all your services must know about each other, and a data-model change will propagate to every service because of the update procedures.
Questionable eventual consistency: it can be done so that data is consistent after failures, but it is not going to be easy.
Solution 5: This is kind of the answer :-)
But you do not want everything in there. Keep separate services that serve separate entities, and build other services on top of them.
If the client wants enriched data, build a service that returns enriched data, as in Solution 2.
If the client wants to display a list of enriched data with filtering and sorting, build a service that provides enriched data with filtering and sorting capability! Likely, the implementation of such a service will contain an ES instance holding cached and indexed data from the lower-level services. The point is that ES does not have to contain everything or be shared between all services; it is up to you to find the right balance between performance and infrastructure resources.
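To make the ES suggestion concrete, here is a sketch of the indexing side using Elasticsearch's plain REST API (the index name, document shape, and event wiring are assumptions): each change event upserts a denormalized worker document, so filtering and sorting become a single indexed query:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Hypothetical read-side indexer: listens to entity-change events (wiring
    // not shown) and upserts a denormalized document into Elasticsearch.
    public class WorkerIndexer {
        private final HttpClient http = HttpClient.newHttpClient();

        public void indexWorker(String workerId, String name,
                                String floorName, String buildingName) throws Exception {
            // Real code would build this with a JSON library to handle escaping.
            String doc = """
                {"name": "%s", "floorName": "%s", "buildingName": "%s"}
                """.formatted(name, floorName, buildingName);

            // PUT <es-host>/workers/_doc/<id> is the standard ES index API.
            HttpRequest req = HttpRequest.newBuilder(
                    URI.create("http://localhost:9200/workers/_doc/" + workerId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(doc))
                .build();
            http.send(req, HttpResponse.BodyHandlers.ofString());
        }
    }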
This is a case where Linked Data can help you.
Basically, the floor attribute of the worker would be a URI (a link) to the floor itself, and any other linked data should be expressed as URIs as well.
Modeled with some JSON-LD it would look like this:
worker = {
  '#id': '/workers/87373',
  name: 'John',
  floor: {
    '#id': '/floors/123'
  }
}

floor = {
  '#id': '/floors/123',
  level: 12,
  building: { '#id': '/buildings/87' }
}

building = {
  '#id': '/buildings/87',
  name: "John's home",
  city: { '#id': '/cities/908' }
}
This way all the client has to do is prefix the #id with the base URL (like api.example.com) and make a simple GET call.
To remove the burden of the extra calls from the client (in case it's a slow mobile device), we use the gateway pattern with microservices. The gateway can expand those links with very little effort and augment the returned object. It can also make multiple calls in parallel.
So the gateway will make a GET /floors/123 call and replace the floor object on the worker with the reply.
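A rough sketch of that expansion step (names are hypothetical, and a real gateway would parse and serialize with a JSON library rather than work on raw Maps):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical gateway helper: given a parsed JSON-LD object (here just a
    // Map), replace every {"#id": "/floors/123"}-style link with the body of a
    // GET to BASE_URL + id.
    public class LinkExpander {
        private static final String BASE_URL = "https://api.example.com"; // assumed base
        private final HttpClient http = HttpClient.newHttpClient();

        public Map<String, Object> expand(Map<String, Object> resource) throws Exception {
            Map<String, Object> out = new HashMap<>(resource);
            for (Map.Entry<String, Object> e : resource.entrySet()) {
                if (e.getValue() instanceof Map<?, ?> link && link.containsKey("#id")) {
                    String id = (String) link.get("#id");
                    out.put(e.getKey(), fetch(BASE_URL + id)); // e.g. GET /floors/123
                }
            }
            return out;
        }

        private String fetch(String url) throws Exception {
            HttpRequest req = HttpRequest.newBuilder(URI.create(url)).build();
            return http.send(req, HttpResponse.BodyHandlers.ofString()).body();
        }
    }

The per-link fetches are also a natural place to fan out in parallel, as the answer notes.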
1) What are BLL services? What's the difference between them and Service Layer services? What goes into domain services and what goes into the service layer?
2) How should I refactor the BLL model to give it behavior? A Post entity holds a collection of feedbacks, which already makes it possible to add another Feedback through feedbacks.Add(feedback). Obviously there are no calculations in a plain blog application. Should I define a method to add a Feedback inside the Post entity? Or should that behavior be maintained by a corresponding service?
3) Should I use the Unit of Work (and Unit of Work repositories) pattern as described in http://www.amazon.com/Professional-ASP-NET-Design-Patterns-Millett/dp/0470292784, or would it be enough to use the NHibernate ISession?
1) Business Layer and Service Layer are actually synonyms. The 'official' DDD term is an Application Layer.
The role of the Application Layer is to coordinate work between Domain Services and the Domain Model. This could mean, for example, that an application function first loads an entity through a Repository and then calls a method on the entity that does the actual work.
2) Sometimes, when your application is mostly data-driven, building a full-featured Domain Model can seem like overkill. However, in my opinion, once you get used to a Domain Model it's the only way you'll want to go.
In the Post and Feedback case, you want an AddFeedback(Feedback) method from the beginning because it leads to less coupling (you don't have to know whether the Feedback items are stored in a List or a Hashtable, for example) and it offers you a nice extension point. What if you ever want to add a check that no more than 10 Feedback items are allowed? With an AddFeedback method you can easily add the check in one single place.
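A minimal sketch of that extension point (shown in Java; the original discussion is C#-flavored, and all names are illustrative):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    class Feedback {
        final String text;
        Feedback(String text) { this.text = text; }
    }

    class Post {
        // The backing store stays private: callers cannot depend on it being a List.
        private final List<Feedback> feedbacks = new ArrayList<>();

        // Single extension point: the "no more than 10" rule lives here and nowhere else.
        void addFeedback(Feedback feedback) {
            if (feedbacks.size() >= 10) {
                throw new IllegalStateException("a post can have at most 10 feedback items");
            }
            feedbacks.add(feedback);
        }

        List<Feedback> feedbacks() {
            return Collections.unmodifiableList(feedbacks);
        }
    }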
3) The UnitOfWork and Repository patterns are a fundamental part of DDD. I'm no NHibernate expert, but it's always a good idea to hide infrastructure-specific details behind an interface. This reduces coupling and improves testability.
I suggest you first read the DDD book, or its short version, to get a basic understanding of the building blocks of DDD. There's no such thing as a BLL Service or a Service Layer Service. In DDD you've got:
the Domain layer (the heart of your software where the domain objects reside)
the Application layer (orchestrates your application)
the Infrastructure layer (for persistence, message sending...)
the Presentation layer.
There can be Services in all these layers. A Service is just there to provide behaviour to a number of other objects; it has no state. For instance, a Domain layer Service is where you'd put cohesive business behaviour that does not belong in any particular domain entity and/or is required by many other objects. The inputs and outputs of the operations it provides would typically be domain objects.
Anyway, whenever an operation fits perfectly into an entity from a domain perspective (such as adding feedback to a post, which translates into Post.AddFeedback() or Post.Feedbacks.Add()), I always go for that rather than adding a Service, which would only scatter the behaviour across different places and gradually lead to an anemic domain model. There can be exceptions, such as when adding feedback to a post requires making connections between many different objects, but that is obviously not the case here.
You don't need a unit-of-work pattern on top of the NHibernate session:
Why would I use the Unit of Work pattern on top of an NHibernate session?
Using Unit of Work design pattern / NHibernate Sessions in an MVVM WPF