Accessing a database from multiple instances of the same microservice - microservices

In my project, I have a microservice (say A) with a SQL database. We have a 5-node cluster, and this microservice runs on each node, so there are 5 instances of service A running on the cluster. Now suppose a particular function of the microservice runs a select query to retrieve data from the database. Since 5 instances are running, all 5 will run the same query and work on the same data. Is there any way to divide the data among the 5 instances of service A?

Application clustering is different from database clustering. You cannot "divide" data among the 5 instances of the application service, since all instances need the same set of data to function (unless your application is designed to work on a subset of the data, e.g. each application instance serves a specific list of countries, in which case you might be able to split the data by country).
For ideas on clustering at the SQL level, you can look into clustering at the database level: https://www.brentozar.com/archive/2012/02/introduction-sql-server-clusters/ .
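If the application really can work on a subset of the data (the "by country" case above), one possible way to split it is to let each instance own the keys whose hash maps to its index. This is only a minimal sketch: the `country` field, the instance count of 5 and the helper names are assumptions for illustration, not something from the question.

```python
import zlib

def owner_of(key: str, instance_count: int) -> int:
    # Use a stable hash (CRC32) so every instance computes the same owner;
    # Python's built-in hash() is randomized per process for strings.
    return zlib.crc32(key.encode("utf-8")) % instance_count

def rows_for_instance(rows, instance_index, instance_count):
    # Keep only the rows whose partition key belongs to this instance.
    return [r for r in rows
            if owner_of(r["country"], instance_count) == instance_index]

# Hypothetical rows standing in for the result of the select query.
countries = ["US", "IN", "DE", "FR", "JP", "BR", "US", "IN"]
rows = [{"id": i, "country": c} for i, c in enumerate(countries)]

INSTANCE_COUNT = 5  # one per node, as in the question
partitions = [rows_for_instance(rows, i, INSTANCE_COUNT)
              for i in range(INSTANCE_COUNT)]
```

In practice the same filter could be pushed into the query itself, and note that changing the number of instances reshuffles which instance owns which key.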

Related

Horizontal scaling a microservice that processes a lot of data

Let’s say I have a microservice that needs to generate millions of reports with even more rows of data.
Business rules:
One client generates 0 to many reports on a single run
Many clients can be generating reports in a single run
Any request to generate a report for a client that is currently processing should throw an error
The reports are generated on a schedule.
The schedule is stored in the database of the microservice (a) for each client. The schedule is managed by a separate microservice (b) and the data is replicated via integration events to microservice a.
Ex:
Client A, Schedule = today
Client B, Schedule = 3 days from now
Only client A will have a report generated.
Now, let’s say the microservice gets a request to generate all reports for clients configured to generate today. Since it has to generate millions of reports, we want it to horizontally scale.
However, I’m having a hard time identifying a great way to do this. Some ideas:
Only let one instance of microservice (a) retrieve the clients that need to generate today. This can be polled, so that if that instance fails another one can pick it up. Insert this data into a shared cache, or into a topic or queue, that all other instances will process from. Scale based on the number of messages in the topic.
Let another microservice (b) make the request for generation and pass each request into a topic or queue that microservice (a) reads. However, this introduces a dependency between services and can cause some data ownership ambiguities.
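The first idea could be sketched roughly as below. This is a single-process stand-in, not a real deployment: `queue.Queue` plays the role of the topic/queue, threads play the role of the service instances, and all names (`ReportCoordinator`, `due_today`, `run`) are made up for the example.

```python
import queue
import threading
from datetime import date

class AlreadyProcessingError(Exception):
    """Raised when a client already has a generation run in progress."""

class ReportCoordinator:
    def __init__(self):
        self._lock = threading.Lock()
        self._in_progress = set()
        self.work = queue.Queue()  # stand-in for a topic or queue

    def request(self, client_id):
        # Business rule: reject a request for a client that is still processing.
        with self._lock:
            if client_id in self._in_progress:
                raise AlreadyProcessingError(client_id)
            self._in_progress.add(client_id)
        self.work.put(client_id)

    def finish(self, client_id):
        with self._lock:
            self._in_progress.discard(client_id)

def due_today(schedule, today):
    # schedule maps client id -> scheduled date (replicated from service b).
    return [client for client, day in schedule.items() if day == today]

def worker(coordinator, generated):
    # Each worker drains the shared queue until nothing is left.
    while True:
        try:
            client_id = coordinator.work.get_nowait()
        except queue.Empty:
            return
        generated.append(f"report-for-{client_id}")
        coordinator.finish(client_id)

def run(schedule, today, worker_count=3):
    coordinator = ReportCoordinator()
    for client_id in due_today(schedule, today):  # done by one instance only
        coordinator.request(client_id)
    generated = []
    threads = [threading.Thread(target=worker, args=(coordinator, generated))
               for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(generated)
```

With a real broker, the number of workers would be scaled on the queue depth, and the in-progress set would have to live in shared storage rather than in memory.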

Is it necessary to have separate db instance for each microservice instance of same microservice?

Say I have a microservice A and it has 3 instances. Similarly for microservice B there are 5 instances of B service.
I will have separate DB for Microservice A and B that is fine. But is it necessary to have separate db instance for each microservice instance of A or B?
I mean to say for the 3 instances of Microservice A, do I need to have 3 separate instances of db? Or all the instances of A will point to one db instance? Which is better approach?
The question is broad, and there is no better or worse approach; it depends purely on your use cases and user load.
From your question, I understand that you need to spawn separate instances of a single microservice, so ideally all instances of service A should have access to the same data. You can create a multi-master architecture, in which each service instance connects to its own replica of a multi-master database. This article gives an overview of it: scale-out-blog. However, doing this has many implications, and you need to design your services carefully to achieve it.
You can also have the separate instances point to the same DB, in which case you don't need to care about replication and other complex issues.

Parallel processing of records from database table

I have a relational table that is being populated by an application. There is a column named o_number which can be used to group the records.
I have another application that is basically a Spring Scheduler. This application is deployed on multiple servers. I want to understand whether there is a way to make sure that each scheduler instance processes a unique group of records in parallel. If a set of records is being processed by one server, it should not be picked up by another one. Also, in order to scale, we want to be able to increase the number of instances of the scheduler application.
Thanks
Anup
This is a general question, so here's my general 2 cents on the matter.
You create a new layer that manages the requests from your application instances to the database. So you will probably be building a new piece of code/project running on the same server as the database (or on some other server). The application instances talk to that managing layer instead of to the database directly.
The manager keeps track of which records have been requested, and hence fetches only records that are yet to be processed on each new request.
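A rough sketch of such a manager, under some assumptions: the table is called `records`, it has a `claimed_by` column that I added to mark ownership, and an in-memory SQLite database stands in for the real one. The `GroupManager` class and its method are hypothetical names.

```python
import sqlite3
import threading

class GroupManager:
    """Hands out each o_number group to at most one scheduler instance."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()  # serialize claims inside the manager

    def claim_next_group(self, worker_id):
        with self._lock:
            row = self._conn.execute(
                "SELECT o_number FROM records "
                "WHERE claimed_by IS NULL ORDER BY o_number LIMIT 1"
            ).fetchone()
            if row is None:
                return None, []  # nothing left to process
            o_number = row[0]
            # Mark the whole group as taken before handing it out.
            self._conn.execute(
                "UPDATE records SET claimed_by = ? WHERE o_number = ?",
                (worker_id, o_number),
            )
            self._conn.commit()
            ids = [r[0] for r in self._conn.execute(
                "SELECT id FROM records WHERE o_number = ? ORDER BY id",
                (o_number,),
            )]
            return o_number, ids

# Demo data: three groups of records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, o_number TEXT, claimed_by TEXT)"
)
conn.executemany(
    "INSERT INTO records (id, o_number) VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "C")],
)
conn.commit()

manager = GroupManager(conn)
first = manager.claim_next_group("server-1")
second = manager.claim_next_group("server-2")
third = manager.claim_next_group("server-1")
fourth = manager.claim_next_group("server-2")
```

Adding more scheduler instances then only means more callers of `claim_next_group`; the manager itself stays the single place that decides who gets which group.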

Distributed database design style for microservice-oriented architecture

I am trying to convert a monolithic application to a microservice-oriented architecture. On the back end I am using the Spring and Spring Boot frameworks; on the front end I am using Angular 2. I am also using PostgreSQL as the database.
My confusion is this: if I design my databases as distributed according to functionality, there may be 5 databases. That is, I am designing according to vertical partitioning. I would then implement inter-microservice communication to achieve the overall functionality.
The other way I am considering is to horizontally partition the current structure. My domain is educational universities, so half of the universities would go in one DB and the rest in another DB, with services deployed per region (two deployments for the two sets of universities).
Currently I have decided to go with the last-mentioned approach. I am new to this kind of task, since it is an architecture task, and I am also a beginner in the microservice and distributed database world. Could someone confirm that my approach will solve my issue? Can I continue with my second approach - horizontal partitioning of databases according to domain object?
Can I continue with my second approach - Horizontal partitioning of databases according to domain object?
Temporarily yes, if on that basis you are able to scale your current system to meet your needs.
Now let's think about why, in the first place, you want to move to microservices as a development style:
Small components - easier to manage
Independently deployable - continuous delivery
Multiple languages
The code is organized around business capabilities
and .....
When moving to microservices, you should not have multiple services reading directly from each other's databases, which makes them tightly coupled.
One service should be completely ignorant of how another service designed its internal structure.
Now, if you want to move towards microservices and take full advantage of them, you should use vertical partitioning, as you say, and have the services talk to each other.
Also, while moving towards microservices you will run into lots and lots of other problems. I tried compiling how one should get started with microservices at this link.
How to separate services that read data from the same table:
Let's first create a dummy example: we have three services, Order, Shipping and Customer, each a separate microservice.
These are the ways in which multiple services may need data from the same table:
One service needs to read data from another service for things like validation.
The Order and Shipping services might need some data from the Customer service to complete their operations.
E.g. when placing an order, one calls the Order Service API with a customer id, and the Order Service might need to validate whether it is a valid customer.
One approach: database-level exposure -- not recommended -- use the same customer table -- which binds the Order service to the Customer service implementation.
Another approach: call the other service to get the data.
Variation 1 - call the Customer service to check whether the customer exists and get some customer data such as the name, and save this in the Order service.
Variation 2 - do not validate while placing the order; on the OrderPlaced event, check asynchronously against the Customer service, validate, and update the state of the order if required.
I recommend calling the other service to get the data, choosing the variation based on the consistency you want.
In some use cases you want a single transaction across data from multiple services.
E.g. deleting a customer: you might want all orders of that customer to be deleted as well.
In this case you need to deal with eventual consistency: service one raises an event and service two reacts accordingly.
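To make the delete-customer case concrete, here is a small in-process sketch. The `EventBus` class is a hypothetical stand-in for a real message broker, and the event name `CustomerDeleted` and the sample data are my own assumptions.

```python
from collections import defaultdict

class EventBus:
    """Toy broker: events are queued on publish and delivered later."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._pending = []

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self._pending.append((event_type, payload))

    def deliver_pending(self):
        # Simulates the asynchronous delivery done by a real broker.
        pending, self._pending = self._pending, []
        for event_type, payload in pending:
            for handler in self._handlers[event_type]:
                handler(payload)

class CustomerService:
    def __init__(self, bus):
        self._bus = bus
        self.customers = {1: "Alice", 2: "Bob"}

    def delete_customer(self, customer_id):
        # Change only our own data, then announce what happened.
        del self.customers[customer_id]
        self._bus.publish("CustomerDeleted", {"customer_id": customer_id})

class OrderService:
    def __init__(self, bus):
        self.orders = {100: 1, 101: 1, 102: 2}  # order id -> customer id
        bus.subscribe("CustomerDeleted", self.on_customer_deleted)

    def on_customer_deleted(self, payload):
        # React to the event by cleaning up our own table.
        self.orders = {order: customer for order, customer in self.orders.items()
                       if customer != payload["customer_id"]}

bus = EventBus()
customer_service = CustomerService(bus)
order_service = OrderService(bus)

customer_service.delete_customer(1)
orders_before_delivery = dict(order_service.orders)  # still has the old orders
bus.deliver_pending()
orders_after_delivery = dict(order_service.orders)
```

The gap between `orders_before_delivery` and `orders_after_delivery` is the eventual-consistency window: for a short time the orders of a deleted customer still exist, and the Order service has to tolerate that.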
Now, if this answers your question, then OK; otherwise, specify in what kind of scenario multiple services need to call another service.
If it is still not solved, you could email me at puneetjindal.11#gmail.com and I will answer you.
Currently I am decided to continue with the last mentioned approach.
If you want horizontal scalability (scaling for an increasingly large number of client connections) for your database, you may be better off with a technology that was designed to work as a scalable, distributed system, something like CockroachDB or a NoSQL database. CockroachDB, for example, has built-in data sharding and replication and lets you grow by adding server nodes as required.
when I am designing my databases as distributed, according to functionalities it may contain 5 databases
This sounds like you had the right general idea - split by domain functionality. Here's a link to a previous answer regarding general DB design with micro services.
In the microservices world, each microservice owns a set of functionalities and the data manipulated by these functionalities. If a microservice needs data owned by another microservice, it cannot go directly to the database maintained/owned by the other microservice; rather, it calls an API exposed by the other microservice.
Now, regarding the placement of data, there are various options: you can store the data owned by a microservice in a NoSQL database like MongoDB, DynamoDB or Cassandra (it really depends on the microservice's use case), OR you can have a different table for each microservice in a single instance of a SQL database. BUT remember, if you choose a single instance of a SQL database with multiple tables, there should be no joins (basically no interaction) between tables owned by different microservices.
I would suggest you start small and then think about database scaling issues when the usage of the system grows.

Microservices: database and microservice instances

Let's say we have microservices A and B. B has its own database. However, B has to be horizontally scaled, so we end up with 3 instances of B. What happens to the database? Does it scale accordingly, does it stay the same (centralized) database for the 3 B instances, does it become a distributed database, or what?
The answer depends on which kind of data has to be shared among the 3 B instances. Some cases:
B only reads data without writing anything: the DB can use replication, with the three B instances each reading from a different replica of the DB.
Each B instance can read/write data without interfering with the other B instances: every B instance can have its own designated data, with no data sharing between instances, so the database becomes three databases with the same schema but entirely different data.
The B instances share most of the data, and every instance occasionally writes data back to the DB: the B instances should use one DB, with some DB locking to avoid conflicts between the instances.
In other situations there are many other approaches to the issue, such as using an in-memory DB like Redis or a queue service like RabbitMQ for the B instances.
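For the third case (one shared DB with occasional writes), one common way to avoid conflicts is optimistic locking with a version column. To be clear, this is one technique I picked for illustration, not necessarily the kind of lock meant above; the `items` table, its columns and the helper names are assumptions, and SQLite stands in for the shared database.

```python
import sqlite3

def read_item(conn, item_id):
    # Returns (quantity, version) as currently stored.
    return conn.execute(
        "SELECT quantity, version FROM items WHERE id = ?", (item_id,)
    ).fetchone()

def write_if_unchanged(conn, item_id, new_quantity, expected_version):
    # The UPDATE only matches if nobody has bumped the version since we read.
    cur = conn.execute(
        "UPDATE items SET quantity = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_quantity, item_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, quantity INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 10, 0)")
conn.commit()

# Two B instances read the same row at the same version.
qty_a, ver_a = read_item(conn, 1)
qty_b, ver_b = read_item(conn, 1)

first_ok = write_if_unchanged(conn, 1, qty_a - 1, ver_a)   # wins
second_ok = write_if_unchanged(conn, 1, qty_b - 1, ver_b)  # stale, rejected

# The losing instance re-reads and retries with fresh data.
qty_b, ver_b = read_item(conn, 1)
retry_ok = write_if_unchanged(conn, 1, qty_b - 1, ver_b)
final_state = read_item(conn, 1)
```

The rejected write is the conflict being detected instead of silently overwriting the other instance's change; pessimistic row locks in the database would be the alternative when conflicts are frequent.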
Using one database from multiple service instances is OK when you are using data partitioning.
As explained by Chris Richardson in the Database per Service pattern,
Instances of the same service should share the same database
