One of the benefits of a microservice architecture is that you can scale the heavily used parts of an application without scaling the rest. This supposedly reduces cost.
However, my question is: if a heavily used microservice depends on other microservices to do its work, wouldn't you have to scale those services as well, seemingly defeating the purpose? And if a microservice calls another microservice in real time to do its job, does that mean the microservice boundaries are not established correctly?
There's no rule of thumb for that.
Scaling usually depends on metrics: when certain thresholds are reached, new instances are created, and when those instances are no longer needed, they are removed.
Some services do simple, fast tasks, like taking an input and writing it to the database, while others run longer tasks that can take any amount of time.
If a service that needs to scale is calling a service that can easily and reliably handle heavy loads, then there is no need to scale that second service.
The idea behind scaling is to scale up when needed to support the load, and then scale down once the load returns to its regular range, in order to reduce costs.
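As a rough illustration of threshold-based scaling, here is a minimal Python sketch. The function name, the thresholds (0.75 and 0.30), and the instance limits are all made up for the example; a real autoscaler would take them from its own configuration.

```python
def desired_instances(current, load_per_instance,
                      scale_up_at=0.75, scale_down_at=0.30,
                      min_instances=1, max_instances=10):
    """Return the instance count after one scaling decision.

    load_per_instance is a utilisation figure between 0 and 1
    (hypothetical metric; thresholds are illustrative assumptions).
    """
    if load_per_instance > scale_up_at:
        # Load is above the upper threshold: add one instance, up to the cap.
        return min(current + 1, max_instances)
    if load_per_instance < scale_down_at:
        # Load is back in the quiet range: remove one instance to save cost.
        return max(current - 1, min_instances)
    # Inside the regular range: leave the service alone.
    return current
```

Each service would be evaluated against its own metric, so a busy service can grow while a dependency that stays below its threshold keeps its instance count.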
There are two topics to discuss here.
First, it is usually not good practice for two microservices to communicate synchronously, because that couples them in time: one service has to wait for the other to finish its task. So it is normally a better approach to use a message queue to decouple the producer from the consumer. This way the load on one service doesn't affect the other.
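A minimal in-process sketch of that decoupling, using Python's standard queue and a thread in place of a real message broker (the function name and the upper-casing "work" are placeholders for illustration only):

```python
import queue
import threading


def run_pipeline(messages):
    """The producer enqueues and moves on; the consumer drains at its own pace."""
    work = queue.Queue()
    results = []

    def consumer():
        while True:
            item = work.get()
            if item is None:              # sentinel: no more work
                break
            results.append(item.upper())  # stand-in for the real processing

    worker = threading.Thread(target=consumer)
    worker.start()
    for message in messages:
        work.put(message)                 # the producer never waits for processing
    work.put(None)
    worker.join()
    return results
```

With a real broker the producer and consumer would be separate services, and each could be scaled according to its own backlog rather than the other's request rate.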
However, there are situations in which synchronous communication between two services is necessary, but that doesn't necessarily mean both have to scale the same way. For example, suppose one service has to make several calls to other services, run database queries, or do other heavy computational work, while one of the services it calls only sorts an array. The first service will probably have to scale much more than the second to process the same number of requests, because its threads stay occupied for much longer than the threads of the second.
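A back-of-the-envelope calculation makes the difference concrete. The sketch below assumes one thread per in-flight request and uses invented numbers (1000 requests per second, 200 ms versus 5 ms per request, 50 threads per instance):

```python
import math


def instances_needed(requests_per_second, ms_per_request, threads_per_instance):
    """Estimate instances from how many threads are busy at any moment."""
    # Average number of requests in flight = arrival rate * time per request.
    busy_threads = requests_per_second * ms_per_request / 1000
    return math.ceil(busy_threads / threads_per_instance)


# The slow caller: each request holds a thread for 200 ms.
caller = instances_needed(1000, 200, 50)
# The fast callee (e.g. array sorting): each request holds a thread for 5 ms.
callee = instances_needed(1000, 5, 50)
```

Under these assumptions the caller needs 4 instances and the callee needs 1, even though both see the same request rate.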
Related
In a world of microservices, one microservice often needs to invoke another, either synchronously or asynchronously.
With synchronous communication, I understand that it affects the availability of the services, since both services need to be available during the call.
To minimize synchronous communication, one possible solution is DATA REPLICATION at the client service. The client service keeps its copy of the data up to date by listening to events published by the other services.
In my opinion, this is not a good choice: we are duplicating data, it might become stale, and there is database overhead.
In what scenario would the above pattern be the best fit?
Microservices are distributed systems. This means that they are constrained by the CAP theorem, which basically means you have a choice between:
Sacrifice availability to preserve consistency: this would (among other things) lead to one service invoking functionality in another in a synchronous way. If the other service is unavailable, so is all functionality in this service which depends on that service's functionality.
Sacrifice consistency to preserve availability: you build services to be autonomous and not depend on other services being up. This leads in fairly short order to services not sharing databases and to asynchronous replication of data (because if service A has synchronously replicated data from service B, then service B being down doesn't affect A's availability, but A being down affects B's availability): with asynchronous replication, the best you can hope for is eventual consistency.
The choice between those two (if you happen to have the ability to freeze the entire universe if there's a network partition, you might be able to sacrifice partition tolerance for consistency and availability) is ultimately a business question (it's worth noting that there's a continuum of approaches between those extremes). How much are you spending on storage and on designing an (arguably) more complex system vs. how much are you losing by being unavailable?
It should be noted that the universe is inherently eventually consistent: the sun could have gone supernova a few minutes ago and we can't know it for a few minutes more.
As for the concern about duplicated data: chances are the data is already duplicated (backups) and in any database worth using the data is duplicated (the write-ahead log).
As for situations, it's a lot harder to think of a situation where aiming for strong consistency is strictly the most suitable option.
But for an example, consider a chain of coffee shops. We have a cash register service and we have a loyalty/rewards service. Data from the loyalty/rewards service is needed by the cash register (if a customer is redeeming a "50% off a latte" reward you'd want the register to know that it's valid), and every transaction (at least those with a loyalty ID) at the register should be known by the rewards service.
If we want the reward redemptions to be consistent, then it implies that if the loyalty/rewards service is inaccessible from the register, no rewards can be redeemed. There's a nonzero chance that a customer who can't redeem a reward just walks out (and a further nonzero chance that they never get coffee from you again).
Conversely, if we want both services to have a consistent view then we're demanding that if the power's out at any store we can't determine new rewards, or if the loyalty/rewards service is inaccessible from the register, no new sales can be made.
The solution is for both services to maintain the data they need to function, even if another service controls updates to that data. They'll eventually catch up. In the case of reward redemption, assuming the unavailability happens rarely enough, it may even be desirable to have the cash register perform a preliminary validation and if that passes, assume that the reward is valid and submit it later to the loyalty/reward service.
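To make the idea concrete, here is a minimal sketch of a register that works from its own copy of the reward data. The class, method names, and event shape are hypothetical; the submit callable stands in for whatever client talks to the loyalty/rewards service.

```python
class CashRegister:
    """Keeps the reward data it needs locally so sales continue during an outage."""

    def __init__(self):
        self.known_rewards = {}  # reward_id -> customer_id, fed by loyalty events
        self.pending = []        # redemptions accepted locally, not yet submitted

    def on_reward_issued(self, reward_id, customer_id):
        # Called when an event from the loyalty/rewards service arrives.
        self.known_rewards[reward_id] = customer_id

    def redeem(self, reward_id, customer_id):
        # Preliminary validation against the local copy only.
        if self.known_rewards.get(reward_id) != customer_id:
            return False
        del self.known_rewards[reward_id]  # prevent a second local redemption
        self.pending.append((reward_id, customer_id))
        return True

    def flush(self, submit):
        # Submit queued redemptions; keep the ones that still cannot be sent.
        still_pending = []
        for redemption in self.pending:
            try:
                submit(redemption)
            except ConnectionError:
                still_pending.append(redemption)
        self.pending = still_pending
```

The register never blocks a sale on the other service. The cost is that its view may be briefly stale, which is the eventual consistency trade-off described above.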
I am trying to learn how to architect a business application that adheres to microservice fundamentals and their considerations. I have come across a question that confuses me a bit.
In a microservice architecture with multiple microservices, each with its own DB, if data needs to be shared among them, what is the preferred way: a service bus, or calling the other service via HttpClient?
I know that with a message queue on a service bus, whenever a message needs to be shared, one microservice can publish it and all subscribers can then retrieve it. But if that information also needs to be stored in the other microservice's DB, wouldn't that become redundant data?
So isn't it enough to simply read the data via HttpClient whenever it is needed?
Looking forward to your replies; thanks in advance for the help.
It depends on other factors such as latency, redundancy, and availability. Both options work: keeping redundant data, or making a REST call whenever the data is needed.
Points against direct HTTP client calls:
It impacts availability. It reduces the overall availability of the system.
It impacts performance and latency. Suppose an operation in service A needs data from service B, and the operation is very frequent. In that case the call reduces performance and increases latency and response time.
It doesn't support JOINs, so you have to combine the data yourself, which also impacts performance.
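The availability point can be put into numbers. As a simplified model, assume failures are independent and a request succeeds only if every service in the synchronous chain is up; the 99.9% figures below are invented for illustration.

```python
def combined_availability(*availabilities):
    """Availability of a request that needs every listed service to be up."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def downtime_hours_per_year(availability):
    """Expected unavailable hours in a 365-day year."""
    return (1 - availability) * 24 * 365


# Service A alone versus service A calling service B synchronously.
alone = combined_availability(0.999)
chained = combined_availability(0.999, 0.999)
```

Under this model two 99.9% services chained together give roughly 99.8%, which about doubles the expected downtime, and every further synchronous hop lowers it again.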
Points against the message bus / event-driven approach:
Duplicate data, which increases the complexity of the system because the copies must be kept in sync.
It reduces the consistency of the system. The system is now eventually consistent.
In system design, no option is outright incorrect. Every option has pros and cons, so choose wisely according to your requirements and system.
Assuming
I am building a stateless micro-service which exposes a few simple API endpoints (e.g., a read-through cache, save an entry to database),
I am using non-blocking database clients (e.g., for MySQL or Redis), and
I always want my microservices to speak to each other via HTTP (by placing EC2 instances behind a load balancer)
Questions
When would I want to use more than one standard verticle, as opposed to writing the whole microservice as a single verticle and deploying n instances of it (n = number of event-loop threads)? Won't adding more verticles only add serialization and context-switching costs?
Let's say I split the microservice into multiple standard verticles (for whatever reason). Wouldn't deploying n instances of each (n = number of event-loop threads) always give better performance than deploying some other ratio of instances? Each verticle is just a listener on an address, so every event-loop thread could then handle all kinds of messages, and they are load balanced already.
When would I want to run my application in cluster mode? Based on the docs, I get the feeling that cluster mode makes sense only when you have multiple verticles, and even then only when you have an actual use case for clustering, e.g., different EC2 instances handling requests for different users to help with data locality (say, using Ignite).
P.S. Please help even if you can answer only one of the above questions.
I always want my microservices to speak to each other via HTTP (by placing EC2 instances behind a load balancer)
It doesn't make much sense to use Vert.x if you have already gone for this overcomplicated approach.
Vert.x uses the Event Bus for in-cluster communication, eliminating the need for HTTP as well as for a load balancer in front.
Answers:
Why should it? If the verticles are not talking to each other, where would the serialization overhead occur?
If your verticles are using non-blocking calls (and thus are multithreaded), you won't see any difference between 1 and N instances on the same machine. Also, if your verticle starts an (HTTP) server on a certain port, then all instances will share that single server across all threads (Vert.x does some rerouting magic here).
Cluster mode is the thing I mentioned at the beginning. This is the proper way to distribute and scale your microservices.
A verticle is a way to structure your code. So you'd probably want a verticle of another type when your main verticle grows too big. How big? That depends on your preferences. I try to keep them rather small, about 200 LOC at the most, each doing one thing.
Not necessarily. Different verticles can perform very different tasks, at different paces. Having N instances of all of them is not necessarily bad, but rather redundant.
Probably never. Clustered mode was a thing before microservices. Using it adds another level of complexity (cluster manager, for example Hazelcast), and it also means you cannot be as polyglot.
I have some computation intensive and long-running task. It can easily be split into sub-tasks and also it would be kind of easy to aggregate the results later on. For example Map/Reduce would work well.
I have to solve this on Cloud Foundry, and there I want to take advantage of auto-scaling, that is, the creation of additional instances due to high CPU load. Normally I use Spring Boot for developing my CF apps.
Any ideas are welcome on how to divide and conquer in an elastic way on CF. It would be great to have as many instances created as CF is willing to create, without needing to configure the number of available application instances in the application. I also need to trigger the creation of instances by loading the CPUs to provoke auto-scaling.
I have to solve this on Cloud Foundry
It sounds like you're on the right track here. The main thing is that you need to write your app so that it can coexist with multiple instances of itself (or perhaps break it into a primary node that coordinates work and multiple worker apps). However you architect the app, being able to scale up instances is critical. You can then simply cf scale to add or remove nodes and increase capacity.
If you wanted to get clever, you could set up a pipeline to run your jobs. Step one would be to scale up the worker nodes of your app, step two would be to schedule the work to run, step three would be to clean up and scale down your nodes.
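A minimal sketch of such a three-step pipeline is below. The scale and run_jobs callables are placeholders (for example, scale might wrap a cf scale invocation), and the worker counts are arbitrary assumptions.

```python
def run_batch(scale, run_jobs, workers_during=8, workers_after=1):
    """Scale up, run the work, and always scale back down afterwards."""
    scale(workers_during)        # step one: add worker instances
    try:
        return run_jobs()        # step two: schedule and run the work
    finally:
        scale(workers_after)     # step three: clean up, even if the work failed
```

The try/finally is the design choice worth noting: it keeps a failed job from leaving the extra instances running.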
I'm suggesting this because manual scaling is going to be the simplest path forward (please read on for why).
and there I want to take advantage of auto-scaling, that is, the creation of additional instances due to high CPU load.
As to autoscaling, I think it's possible, but I also think it makes the problem more complicated than it needs to be. Autoscaling by CPU on Cloud Foundry is not as simple as it seems. The way Linux reports CPU usage, you can exceed 100%, because it is 100% per CPU core. Pair this with the fact that you may not know how many CPU cores are on your Cells (for example, if you're using a public CF provider) and the fact that the number of cores could change over time (if your provider changes hardware), and it becomes difficult to know at what point you should scale your application.
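A tiny worked example of why the core count matters, with invented numbers: the same reported figure means very different things depending on how many cores sit behind it.

```python
def normalized_cpu(reported_percent, cores):
    """Convert a per-core percentage (can exceed 100) to a share of the whole machine."""
    return reported_percent / cores


# The same reading of 250% on two hypothetical cells:
on_4_cores = normalized_cpu(250, 4)   # 62.5% of the machine
on_8_cores = normalized_cpu(250, 8)   # 31.25% of the machine
```

A threshold tuned for one cell size will therefore fire at the wrong time on another, which is the difficulty described above.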
If you must autoscale, I would suggest trying to autoscale on some other metric. Which metrics are available will depend on the autoscaler tool you are using. The best case would be a custom metric, so that you could use work queue length or something else that's relevant to your application. If custom metrics are not supported, you could always hack together your own autoscaler that works with metrics relevant to your application (you can scale up and down by adjusting the instance count of your app using the CF API).
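As a sketch of what a home-grown autoscaler's core might look like, the snippet below derives an instance count from work queue length and builds the matching cf scale command. The function names, the jobs-per-instance ratio, and the limits are assumptions for illustration; how you read the queue length and whether you call the CLI or the CF API is up to you.

```python
import math


def target_instances(queue_length, jobs_per_instance,
                     min_instances=1, max_instances=20):
    """Pick an instance count proportional to the backlog, within fixed bounds."""
    wanted = math.ceil(queue_length / jobs_per_instance)
    return max(min_instances, min(wanted, max_instances))


def scale_command(app_name, instances):
    """Build (but do not run) the cf CLI call that sets the instance count."""
    return ["cf", "scale", app_name, "-i", str(instances)]
```

A small loop could call target_instances periodically and pass the resulting command to subprocess.run whenever the target differs from the current count.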
You might also be able to hack together a solution based on the metrics that your autoscaler does provide. For example, you could artificially inflate a metric that your autoscaler does support in proportion to the workload you need to process.
You could also just scale up when your work day starts and scale down at the end of the day. It's not dynamic, but it's simple and it will get you some efficiency improvements.
Hope that helps!
I want to know the applicability of the Akka Actor model.
I know it is useful when a huge number of Actor instances are created and destroyed, e.g. a call server, where every incoming call creates an actor instance that communicates with a few other actors and is killed after the call is over.
Is it also useful in the following scenario :
A server has a few processing elements (10~50) implemented as Actors. The lifetime of these processing elements is infinite. Some of them maintain state and some do not. The processing elements process each message and pass it on to other actors in a fixed manner. The system receives a huge number of messages from outside; they pass through the processing elements and then leave the system.
My gut feeling is that we cannot gain any advantage by using the Akka Actor model, or even by implementing this server in Scala, because the use case Akka is designed for does not apply here. If scaling up meant that processing elements were added dynamically, then it would apply.
For fixed topologies, I think implementing it in Java will be more beneficial in terms of raw performance. The 'immutability' feature of Scala leads to more copies and so reduces performance. So I believe I had better stick to Java.
Is my understanding correct? In a nutshell, I want to know why I should leave Java and use Scala/Akka for the application scenario above. My target is to process 1 million messages per second.
If this question is still relevant...
Scala vs. Java
Scala gives productivity to developers.
Immutability reduces debugging effort to almost zero.
The GC copes very well with discarded immutable objects.
Akka Actors vs. other means
Akka has a dispatcher that distributes all tasks across a fixed thread pool, which allows the available resources to be consumed evenly. This approach is much better than fixed worker threads: the processing resources are given to the tasks, not to the DataFlow nodes.
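The idea can be loosely imitated outside Akka. The sketch below is plain Python, not Akka's dispatcher: every (stage, message) pair becomes a task on one shared pool, so threads go wherever there is work instead of being tied to a node. It assumes at least one stage and that stages do not raise; the function and parameter names are invented for the example.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def process_all(messages, stages, pool_size=4):
    """Run each message through all stages, one pooled task per stage step."""
    finished = queue.Queue()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:

        def step(index, stage_no, value):
            value = stages[stage_no](value)
            if stage_no + 1 < len(stages):
                # Hand the message to the next stage as a new task on the same pool.
                pool.submit(step, index, stage_no + 1, value)
            else:
                finished.put((index, value))

        for index, message in enumerate(messages):
            pool.submit(step, index, 0, message)
        # Wait for every message before the pool is shut down.
        done = [finished.get() for _ in messages]
    # Restore input order; tasks may complete in any order.
    return [value for _, value in sorted(done)]
```

For example, process_all([" a ", "b "], [str.strip, str.upper]) passes each string through both stages while a slow stage can temporarily occupy most of the pool.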
DataFlow implementation
There is a SynapseGrid library that is built on top of Akka Actors and allows easy construction of DataFlow systems distributed over fixed immortal Actors. It can even draw the DataFlow diagram (in .dot format) of the whole system.
(The library is more convenient to use from Scala.)