Microservice for text search only - microservices

I have a monolith (a big one) and I need text search capabilities.
Is it correct, from an architectural point of view,
to create a new microservice that only returns text search query results?
I mean, say a user wants to search for something: instead of the monolith doing the work, it only accepts the user input and calls the microservice that does the text search.

It depends on how you design your database. From your description, it seems that your monolith has a single database. If you create a service just for text search on top of it, I think that's not a good idea. A basic principle of microservices is that each service has its own database and that services should not depend on each other.
For text search, a NoSQL database is generally the better fit.
If you can separate your database, this can work. But once the databases are separate, you have to think about keeping their data in sync (for your text search), which has its own patterns and, of course, its own problems. If you have time (and if you really need microservices), I would refactor the monolith into microservices; if you don't, a monolith is not that bad 😄
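To make the trade-off concrete, here is a minimal sketch of the arrangement the question describes, assuming the monolith stays the source of truth and pushes every new document to a separate search component. All class and method names (`SearchBackend`, `InMemorySearchService`, `Monolith`, `save_article`, `handle_search`) are hypothetical, and the in-memory index only stands in for a real search store reached over the network.

```python
from collections import defaultdict
from typing import Protocol


class SearchBackend(Protocol):
    """What the monolith needs from the search service (hypothetical interface)."""

    def index_document(self, doc_id: int, text: str) -> None: ...

    def search(self, query: str) -> list[int]: ...


class InMemorySearchService:
    """Stand-in for the separate search service; it owns its own index."""

    def __init__(self) -> None:
        self._index: dict[str, set[int]] = defaultdict(set)

    def index_document(self, doc_id: int, text: str) -> None:
        # The sync step: the monolith pushes every new document here.
        for token in text.lower().split():
            self._index[token].add(doc_id)

    def search(self, query: str) -> list[int]:
        tokens = query.lower().split()
        if not tokens:
            return []
        # AND semantics: a document must contain every query token.
        postings = [self._index.get(token, set()) for token in tokens]
        return sorted(set.intersection(*postings))


class Monolith:
    """Keeps the source of truth and delegates only the searching."""

    def __init__(self, search: SearchBackend) -> None:
        self._articles: dict[int, str] = {}
        self._search = search

    def save_article(self, article_id: int, text: str) -> None:
        self._articles[article_id] = text
        self._search.index_document(article_id, text)

    def handle_search(self, user_input: str) -> list[str]:
        # Accept the user input, ask the search service for IDs,
        # then load the full records from the monolith's own store.
        return [self._articles[i] for i in self._search.search(user_input)]
```

The `index_document` call inside `save_article` is the data-sync problem mentioned above in its simplest form: if that call fails or is skipped, the index and the monolith's data drift apart.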

Related

Is it possible to replicate tables from multiple databases in Google Cloud?

The company that I work at uses a microservices architecture with the 'database per service' pattern. This pattern makes it harder to query based on data from multiple services, since each service has its own database. Imagine a service for managing your products and one for managing stock. You would have to somehow combine the data from both services to query for products based on stock.
I know that event sourcing and API composition are potential solutions to the problem, but I was wondering if it is possible to continuously replicate specific tables from the product and stock databases based on database transaction logs. Wouldn't this be much simpler than, say, implementing an event-based solution like event sourcing? One service that I am working with contains a lot of domain events, which would make implementing and maintaining an event-based solution rather complex.
Another reason I am considering looking at the problem from a different angle is that there is a lot of data. In-memory joins with, say, API composition will most likely be slow.
To sum it all up, I would like to know if it is possible to continuously replicate specific tables from different databases into one database.
The technologies that my company uses are primarily Spring Framework and PostgreSQL.
I would step back and ask why you have microservices (including why you have multiple databases). This is because it's quite easy to make choices that are superficially easy but which achieve that ease by negating the reason you had the microservices to begin with, and in such a situation, it may in fact be easier to just not do microservices.
For example, you might be doing microservices because you want to be able to have the team maintaining your product service be able to make changes without coordinating with the stock service or vice versa. By setting up a direct replication of a table from service A's database into service B's database, you essentially require many changes service A might want to make to that table to be coordinated with service B. It's perhaps less operationally coupled than unifying the services into a monolith, but in terms of developer velocity, you're giving up a fair amount.
Alternatively, if the rationale is to allow one service to be down (failures, maintenance, releases: doesn't matter) without taking the others down, a replication which guarantees strong consistency implies that taking service B's database down prevents service A from updating its database (because if you allowed service A to update its database in that situation, you couldn't have strong consistency).
Rather than direct replication, it might make sense to use change data capture (e.g. with Debezium) to publish a stream of changes from the transaction logs (e.g. to Kafka). The critical difference from logical replication is that the consumer can, for instance, choose to ignore updates to columns it doesn't care about: the stock service might include details like where things are stocked in a warehouse, for instance, which is data you don't need for answering a query like "show me the products in this category which are in stock". This can be a nice middle ground between going full event-sourcing and other approaches.
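As a rough illustration of the consumer side of change data capture, the sketch below applies change events from a stock table to a small "product ID to quantity" view and ignores every other column. The event shape (`op`, `before`, `after`) loosely follows Debezium's change event envelope, but the exact layout, the column names (`product_id`, `quantity`, `warehouse_aisle`) and the function names are assumptions made for the example. It also assumes `product_id` is the primary key, so that it is present in `before` on deletes.

```python
def apply_stock_change(view: dict[int, int], event: dict) -> bool:
    """Apply one change event to the view; return True if the view changed.

    Only product_id and quantity are read. Columns such as warehouse_aisle
    are never looked at, so changes to them do not affect the view.
    """
    if event["op"] == "d":  # delete: drop the row from the view
        return view.pop(event["before"]["product_id"], None) is not None
    row = event["after"]  # create, update, or snapshot read
    product_id, quantity = row["product_id"], row["quantity"]
    changed = view.get(product_id) != quantity
    view[product_id] = quantity
    return changed


def in_stock(view: dict[int, int]) -> list[int]:
    """Product IDs that currently have stock, for 'in stock' queries."""
    return sorted(pid for pid, qty in view.items() if qty > 0)
```

In a real deployment the events would arrive from a Kafka topic and the view would live in the querying service's own database; both are left out here to keep the sketch self-contained.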

Microservices "JOINS"

Let's say we want to create the app with microservices.
We have some page where we display some items (products).
These products have multiple joins (categories, tags, users, and so on).
If the users and categories data live in other services, how can we manage and filter the results?
For example, in SQL you write three or four joins and get the result.
With microservices, I have to filter the categories, then filter the tags, and then the products. This could be 10 times slower than the SQL query.
Also, if I have a table "products_categories" which sets the categories for each product, which service is responsible for it: the Product service or the Category service?
Thank you
In a microservices architecture there are two ways to deal with this.
The API composition pattern: this is the simplest approach and should be used whenever possible. It works by making clients of the services that own the data responsible for invoking the services and combining the results.
The Command Query Responsibility Segregation (CQRS) pattern: this is more powerful than the API composition pattern, but it's also more complex. It maintains one or more view databases whose sole purpose is to support queries.
I would prefer CQRS: define a view database, a read-only replica built specifically to support that query. The query side keeps the replica up to date by subscribing to the events (create, update, delete) published by the services that own the data.
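A minimal sketch of the CQRS idea, assuming the product and category services publish "saved" and "deleted" events. The class `ProductListView`, its handler names, and the event payloads are all hypothetical; a real view would sit in its own database and receive events through a message broker.

```python
from dataclasses import dataclass, field


@dataclass
class ProductListView:
    """Query-side view: written only by event handlers, read by queries."""

    categories: dict[int, str] = field(default_factory=dict)
    products: dict[int, dict] = field(default_factory=dict)

    # Handlers for events published by the category service.
    def on_category_saved(self, category_id: int, name: str) -> None:
        self.categories[category_id] = name

    # Handlers for events published by the product service.
    def on_product_saved(self, product_id: int, title: str, category_ids: list[int]) -> None:
        self.products[product_id] = {"title": title, "category_ids": set(category_ids)}

    def on_product_deleted(self, product_id: int) -> None:
        self.products.pop(product_id, None)

    # The query the view exists for: a local "join" with no remote calls.
    def products_in_category(self, name: str) -> list[str]:
        wanted = {cid for cid, cname in self.categories.items() if cname == name}
        return sorted(
            p["title"] for p in self.products.values() if p["category_ids"] & wanted
        )
```

The query never calls the owning services, which is what makes it fast; the price is that the view is only as fresh as the last event it has processed.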
This is a very standard problem whenever microservices are built. People tend to feel that microservices are the solution for everything, which is not true.
The solution to this problem is better design: strike a balance between performance and data redundancy. Higher performance (lower latency) means more duplication of data across the databases of different microservices. You should not aim for performance as good as SQL joins, but you should also not duplicate too much data. A balance is needed.
Most importantly, the requirements need to be divided into the right set of microservices.
I assume you created a "microservice" per database table. Those are not microservices; they are just HTTP-based CRUD interfaces to your database.
First, know why you need microservices. (Is there an actual reason?) Second, each microservice has to encompass at least one full (business) functionality of your software, meaning it doesn't need other services to deliver it.
If you need a table that requires data from multiple microservices, you have by definition drawn the wrong microservices. If a microservice can't provide its own UI without the help of other services, it doesn't fully contain its own functionality.
What's stopping you from having multiple services for reading / writing to the same database / table? For example:
One service to write to categories
One service to write to tags
One service to write to products
You could then write another service to read the data owned by all three of these. However, this need not happen at the HTTP level: instead, the read service could read from the same database directly and leverage the power of SQL.
The service that reads could encompass your join logic which would mean you wouldn't need to consume the other services around it.
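A small sketch of such a read service, using SQLite from the Python standard library as a stand-in for the shared database. The table and column names follow the "products_categories" table from the question, but the schema itself and the function names `create_schema` and `products_in_category` are assumptions made for the example.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Tables that the separate write services would own and populate."""
    conn.executescript(
        """
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE products_categories (
            product_id INTEGER NOT NULL REFERENCES products(id),
            category_id INTEGER NOT NULL REFERENCES categories(id),
            PRIMARY KEY (product_id, category_id)
        );
        """
    )


def products_in_category(conn: sqlite3.Connection, category_name: str) -> list[str]:
    """Read-side query: the join runs in the database, not over HTTP."""
    rows = conn.execute(
        """
        SELECT p.title
        FROM products AS p
        JOIN products_categories AS pc ON pc.product_id = p.id
        JOIN categories AS c ON c.id = pc.category_id
        WHERE c.name = ?
        ORDER BY p.title
        """,
        (category_name,),
    )
    return [title for (title,) in rows]
```

Note that this couples the read service to the table layout of the write services, so schema changes have to be coordinated between them.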

Architecture question about one Elasticsearch instance per microservice

I like the microservices approach: services are easy (or at least easier) to deploy, manage, develop, and so on than a monolith. The microservices pattern prescribes one database instance per microservice. In most cases that isn't an issue, but in some cases it is. I'll explain my problem with an example.
I have a web service where users can upload, e.g., an image, other users can comment on it and rate it, and there is a view counter. I would implement 4 services.
Upload Image Service
the user uploads their image to the website
the image has some meta information such as description, title, tags, and upload date
Comment Service
if a user adds a comment to the image, this service handles the request and creates an entry in the database with the attributes content, videoId, userId and date
View Counter Service
whenever a user views/clicks the image, a new request is sent to this service and a new entry with the user id and video id is stored in the DB
Each service has its own database and all services are completely independent of each other. The communication between services is only via REST API. The DB is Elasticsearch.
And here comes the problem. I want to create a fourth service, the "Image Search Service". It's a really common task, like the search function on YouTube.
For the best search results I need all of the attributes from the preceding 3 services. The search depends, of course, on the tags, description and upload date, but the likes/dislikes, views and comments have an influence too. An image with a high view count will be ranked higher, for example.
But when I store all this information in separate DBs, I cannot consider it in one query, and I think that is necessary for a full-text search.
Does anyone have experience with this, or ideas for solving it, or is there maybe a best practice? I read something about event sourcing, but that's not the right solution for this particular problem.
Of course I can send a request to each of the three services and then write an algorithm to merge the results myself, but I think Elasticsearch is the right tool for this job.
JHipster uses Elasticsearch on top of a MySQL DB. Maybe this could be a solution.
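One way to picture a dedicated search index fed by the other services is sketched below: the search service keeps one denormalized document per image and updates it as the upload, view, and comment services report changes, so a single query can combine text matching with popularity. Everything here is hypothetical (the class names, the handler names, and especially the scoring formula); with Elasticsearch, the matching and ranking would be done by the engine rather than by hand-written code.

```python
import math
from dataclasses import dataclass


@dataclass
class ImageSearchDoc:
    """One denormalized search document per image."""

    image_id: int
    title: str
    tags: tuple[str, ...] = ()
    view_count: int = 0
    comment_count: int = 0


class ImageSearchIndex:
    def __init__(self) -> None:
        self._docs: dict[int, ImageSearchDoc] = {}

    # Updates reported by the upload, view counter, and comment services.
    def on_image_uploaded(self, image_id: int, title: str, tags: list[str]) -> None:
        self._docs[image_id] = ImageSearchDoc(image_id, title, tuple(tags))

    def on_image_viewed(self, image_id: int) -> None:
        doc = self._docs.get(image_id)
        if doc is not None:
            doc.view_count += 1

    def on_comment_added(self, image_id: int) -> None:
        doc = self._docs.get(image_id)
        if doc is not None:
            doc.comment_count += 1

    def search(self, query: str) -> list[int]:
        """Return matching image IDs, best first (illustrative scoring only)."""
        terms = set(query.lower().split())
        scored = []
        for doc in self._docs.values():
            words = set(doc.title.lower().split()) | {t.lower() for t in doc.tags}
            text_score = len(terms & words)
            if text_score == 0:
                continue
            # Dampen raw counts so popularity boosts but does not dominate.
            popularity = math.log1p(doc.view_count) + math.log1p(doc.comment_count)
            scored.append((text_score * (1.0 + popularity), doc.image_id))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [image_id for _, image_id in scored]
```

The owning services stay the source of truth; the search index is a derived copy that can be rebuilt from them if it is lost or falls out of date.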

Distributed database design style for microservice-oriented architecture

I am trying to convert a monolithic application to a microservice-oriented architecture. On the back end I am using Spring and Spring Boot. On the front end I am using Angular 2. I am using PostgreSQL as the database.
My confusion is this: if I design my databases as distributed according to functionality, there may be 5 databases. That is, I would be partitioning vertically. I would then implement inter-microservice communication to achieve the full functionality.
The other way I am considering is to horizontally partition the current structure. My domain is educational universities, so half of the universities would go into one DB and the rest into another, and I would deploy the services per region (two deployments for the two sets of universities).
Currently I have decided to continue with the last-mentioned approach. I am new to this kind of architecture task, and I am also a beginner in the microservice and distributed database world. Could someone confirm that my approach will solve my issue? Can I continue with my second approach - horizontal partitioning of databases according to domain object?
"Can I continue with my second approach - Horizontal partitioning of databases according to domain object?"
Temporarily yes, if that lets you scale your current system to meet your needs.
Now let's think about why, in the first place, you want to move to microservices as a development style:
Small components - easier to manage
Independently deployable - continuous delivery
Multiple languages
Code organized around business capabilities
and so on.
When moving to microservices, you should not have multiple services reading directly from each other's databases, which would make them tightly coupled.
One service should be completely ignorant of how another service designs its internal structure.
Now, if you want to move towards microservices and take full advantage of them, you should use vertical partitioning, as you describe, and have the services talk to each other.
Also, while moving towards microservices you will run into lots and lots of other problems. I tried to compile how one should get started with microservices at this link.
How to separate services which read data from the same table:
First, let's create a dummy example: we have three services, Order, Shipping, and Customer, each a separate microservice.
The following are the ways in which multiple services can require data from the same table:
One service needs to read data from another service for things like validation.
The Order and Shipping services might need some data from the Customer service to complete their operations.
E.g., while placing an order, one calls the Order service API with a customer ID, and the Order service might need to validate whether it is a valid customer.
One approach: database-level exposure (not recommended). Use the same customer table, which binds the Order service to the Customer service's implementation.
Another approach: call the other service to get the data.
Variation 1: call the Customer service to check whether the customer exists and get some customer data, like the name, and save this in the Order service.
Variation 2: do not validate while placing the order; on the OrderPlaced event, check asynchronously with the Customer service, validate, and update the state of the order if required.
I recommend the second approach, calling the other service to get the data, with the variation chosen based on the consistency you need.
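The two variations can be sketched as follows, assuming a Customer service that exposes a lookup by ID. The classes, method names, and order states (`PENDING`, `CONFIRMED`, `REJECTED`) are hypothetical, and the remote call is replaced by an in-process call to keep the example runnable.

```python
from dataclasses import dataclass
from typing import Optional


class CustomerService:
    """Stand-in for the remote Customer service API."""

    def __init__(self, customers: dict[int, str]) -> None:
        self._customers = dict(customers)

    def get_customer_name(self, customer_id: int) -> Optional[str]:
        return self._customers.get(customer_id)


@dataclass
class Order:
    order_id: int
    customer_id: int
    customer_name: Optional[str] = None
    status: str = "PENDING"


class OrderService:
    def __init__(self, customers: CustomerService) -> None:
        self._customers = customers
        self.orders: dict[int, Order] = {}

    def _store(self, order: Order) -> Order:
        self.orders[order.order_id] = order
        return order

    def place_order_validated(self, customer_id: int) -> Order:
        """Variation 1: validate synchronously and keep a copy of the name."""
        name = self._customers.get_customer_name(customer_id)
        if name is None:
            raise ValueError(f"unknown customer {customer_id}")
        return self._store(Order(len(self.orders) + 1, customer_id, name, "CONFIRMED"))

    def place_order_deferred(self, customer_id: int) -> Order:
        """Variation 2: accept immediately; validation happens later."""
        return self._store(Order(len(self.orders) + 1, customer_id))

    def on_order_placed(self, order_id: int) -> None:
        """Handler for the OrderPlaced event: validate and update the state."""
        order = self.orders[order_id]
        order.customer_name = self._customers.get_customer_name(order.customer_id)
        order.status = "CONFIRMED" if order.customer_name is not None else "REJECTED"
```

Variation 1 fails fast but makes placing an order depend on the Customer service being up; Variation 2 stays available but means an order can sit in `PENDING` and later turn out to be invalid.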
In some use cases you want a single transaction across data from multiple services.
E.g., deleting a customer: you might want all of that customer's orders to be deleted as well.
In this case you need to deal with eventual consistency: the first service raises an event and the second service reacts accordingly.
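A minimal sketch of that flow, assuming a `CustomerDeleted` event and a simple in-process queue in place of a real message broker. All names (`EventBus`, `CustomerStore`, `OrderStore`, `deliver_pending`) are hypothetical; the point is only to show the window in which the two services disagree.

```python
from collections import defaultdict, deque


class EventBus:
    """Toy broker: events are queued on publish and delivered later."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, event_type: str, handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self._queue.append((event_type, payload))

    def deliver_pending(self) -> None:
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(payload)


class CustomerStore:
    """Service one: owns customers and raises an event on delete."""

    def __init__(self, bus: EventBus, customer_ids: set[int]) -> None:
        self._bus = bus
        self.customer_ids = set(customer_ids)

    def delete(self, customer_id: int) -> None:
        self.customer_ids.discard(customer_id)
        self._bus.publish("CustomerDeleted", {"customer_id": customer_id})


class OrderStore:
    """Service two: owns orders and reacts to CustomerDeleted."""

    def __init__(self, bus: EventBus) -> None:
        self._orders: dict[int, int] = {}  # order_id -> customer_id
        bus.subscribe("CustomerDeleted", self._on_customer_deleted)

    def add(self, order_id: int, customer_id: int) -> None:
        self._orders[order_id] = customer_id

    def for_customer(self, customer_id: int) -> list[int]:
        return sorted(o for o, c in self._orders.items() if c == customer_id)

    def _on_customer_deleted(self, payload: dict) -> None:
        for order_id in self.for_customer(payload["customer_id"]):
            del self._orders[order_id]
```

Between `delete` and `deliver_pending`, the customer is gone but the orders still exist; readers of the Order service have to tolerate that intermediate state.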
If this answers your question, fine; otherwise, specify in what kind of scenario multiple services need to call another service.
If it is still not solved, you can email me at puneetjindal.11#gmail.com and I will answer.
"Currently I am decided to continue with the last mentioned approach."
If you want horizontal scalability (scaling for an increasingly large number of client connections) for your database, you may be better off with a technology that was designed to work as a scalable, distributed system, something like CockroachDB or a NoSQL database. CockroachDB, for example, has built-in data sharding and replication and lets you grow by adding server nodes as required.
"when I am designing my databases as distributed, according to functionalities it may contain 5 databases"
This sounds like you had the right general idea: split by domain functionality. Here's a link to a previous answer regarding general DB design with microservices.
In the microservices world, each microservice owns a set of functionalities and the data manipulated by those functionalities. If a microservice needs data owned by another microservice, it cannot go directly to the database maintained/owned by the other microservice; rather, it calls an API exposed by the other microservice.
Now, regarding the placement of data, there are various options - you can store data owned by a microservice in a NoSQL database like MongoDB, DynamoDB, Cassandra (it really depends on the microservice's use-case) OR you can have a different table for each micro-service in a single instance of a SQL database. BUT remember, if you choose a single instance of a SQL Database with multiple tables, then there would be no joins (basically no interaction) between tables owned by different microservices.
I would suggest you start small and then think about database scaling issues when the usage of the system grows.

Microservices per DB table?

Person
NativeCountry
SpokenLanguages
I have a question about microservice granularity. I will try to explain it with an example.
Assume I have the above 3 tables in a database, with a many-to-one relationship between Person -> NativeCountry and a one-to-many relationship between Person -> SpokenLanguages.
The front-end application is supposed to do CRUD operations on the Person entity and will also be able to retrieve people based on NativeCountry or SpokenLanguages.
Does it make sense to develop 3 independent microservices, one per entity, and then use an aggregator microservice at an upper layer to build the combined data for the UX layer, or should I combine them into a single microservice?
From your description of the problem, it sounds like "people" are at the center of the functionality and the use case of the service if I understand this correctly.
Search for people by native country
Search for people by language
Add a person with both their native country and the languages spoken
List all the languages
Since three of the required features revolve around people and only one feature just lists the languages, I would argue that this should be one microservice (again, without knowing whether there are external services that depend on the other possible entity services). My argument is that, in order to serve requests, people is the entity of interest, with native country and language being just dimensions by which to retrieve people.
If you break each of the entities (people, language, and country) into a different microservice, the services would be too small and the complexity would increase, e.g. you might need to make multiple requests to multiple services to generate a single response when there is no need to. As for the one feature that doesn't quite revolve around people, I would say that it's too small a feature to be its own microservice. Until there is a need for it to be a standalone service, I would advise putting it into the "people" microservice.

Resources