How to ensure application availability when one or more microservices fail?

If a microservice is not responding for any of the following reasons, how do we ensure overall application availability?
The microservice crashes
A network partition or other transient error occurs
The service is overloaded

"other microservice calling the same microservice"
If you have services calling one another directly, then it doesn't sound like they are using Kafka.
If you have applications sending to Kafka, then those messages are persisted to the broker logs. Any downstream consumer can stay offline for as long as the messages are (configurably) retained in the Kafka cluster.
Ultimately, when using Kafka (or any persistent message queue), services do not know about one another; they only know about the brokers.

You should avoid coupling in a microservices architecture as much as possible.
In your case, I guess you are sending a read-only request to a microservice to get some data, but the called microservice is not up, so the caller microservice can't do its job.
To avoid this kind of situation you can use a data duplication technique. With this technique, the microservice that is the source of the data publishes insert, update, and delete events using a broker like Kafka. Other microservices that also need this data consume it from the corresponding topic and keep their own copy. That way you don't need to make a read-only request to get the data, and you avoid coupling between microservices.
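As an illustration, a minimal sketch of this technique with Spring Kafka; the topic name product-events and the ProductEvent, ProductRepository, and LocalProductStore types are assumptions, not anything from the question:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;
import org.springframework.stereotype.Service;

// Source service: publishes a change event whenever its data changes.
@Service
class ProductService {

    private final ProductRepository repository;
    private final KafkaTemplate<String, ProductEvent> kafkaTemplate;

    ProductService(ProductRepository repository,
                   KafkaTemplate<String, ProductEvent> kafkaTemplate) {
        this.repository = repository;
        this.kafkaTemplate = kafkaTemplate;
    }

    void update(Product product) {
        repository.save(product);
        // Key by id so all events for one entity land on the same partition, in order
        kafkaTemplate.send("product-events", product.getId(),
                new ProductEvent("UPDATED", product));
    }
}

// Consuming service: maintains its own local copy of the data, so it never
// needs to make a synchronous read request against the source service.
@Component
class ProductEventListener {

    private final LocalProductStore localStore;

    ProductEventListener(LocalProductStore localStore) {
        this.localStore = localStore;
    }

    @KafkaListener(topics = "product-events", groupId = "order-service")
    void onEvent(ProductEvent event) {
        localStore.apply(event); // insert/update/delete the local copy
    }
}
```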
What will happen in that case?
In this case, if there is no redundancy for the called microservice, the caller will get an exception like "No instances available for CalledMicroservice".

Related

Advisable to run a Kafka producer + consumer in same application?

Spring + Apache Kafka noob here. I'm wondering if it's advisable to run a single Spring Boot application that handles both producing and consuming messages.
A lot of the applications I've seen using Kafka lately usually have one separate application send/emit the message to a Kafka topic, and another one that consumes/processes the message from that topic. For larger applications, I can see a case for separate producer and consumer applications, but what about smaller ones?
For example: say I have a simple app that processes HTTP requests and sends them to a third-party service; to ensure retryability, I put each request on a Kafka queue and consume it with a service using the @Retryable annotation?
And what other considerations might come into play since it would be on the Spring framework?
Note: As your question states, what I'll say is more advice based on my beliefs and experience than some absolute truth written in stone.
Your use case seems more like a proxy than an actual application with business logic. You should make sure that making this an asynchronous service makes sense - maybe it's good enough to simply hold the connection until you get a response from the third party, and let your client handle retries if you get an error - of course, you can also retry until some timeout.
This would avoid common asynchronous issues, such as making your client poll or expose a webhook in order to get a result, or having to check whether a record is still worth processing after a long delay caused by an outage or high consumer lag.
If your client doesn't care about the result as long as it gets done, and you don't expect high-throughput on either side, a single Spring Boot application should be enough for handling both producer and consumer sides - while also keeping it simple.
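For illustration, a minimal sketch of such a single application handling both sides; the topic name outbound-requests and the ThirdPartyClient type are assumptions:

```java
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Producer side: accept the HTTP request and hand it off to Kafka.
@RestController
class RequestController {

    private final KafkaTemplate<String, String> kafkaTemplate;

    RequestController(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    @PostMapping("/requests")
    ResponseEntity<Void> accept(@RequestBody String body) {
        kafkaTemplate.send("outbound-requests", body);
        return ResponseEntity.accepted().build(); // 202: work happens asynchronously
    }
}

// Consumer side, in the same Spring Boot application: forward to the third party.
@Component
class RequestForwarder {

    private final ThirdPartyClient client; // hypothetical REST client wrapper

    RequestForwarder(ThirdPartyClient client) {
        this.client = client;
    }

    @KafkaListener(topics = "outbound-requests", groupId = "proxy")
    void forward(String body) {
        client.send(body); // an exception here triggers the configured error handling
    }
}
```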
If you do expect high throughput, I'd look into building a WebFlux based application with the reactor-kafka library - high throughput proxies are an excellent use case for reactive applications.
Another option would be having a simple serverless function that handles the http requests and produces the records, and a standard Spring Boot application to consume them.
TBH, I don't see a use case where having two full-fledged Java applications to handle a proxy duty would pay off, unless maybe you have such sound infrastructure for easily managing them that running two applications instead of one makes no difference, and using more resources is not an issue.
Actually, if you expect really high traffic and a serverless function wouldn't work, or maybe you want to stick to Java-based solutions, then you could have a simple WebFlux-based application to handle the HTTP requests and send the messages, and a standard Spring Boot or another WebFlux application to handle consumption. This way you'd be able to scale up the former to accommodate the high traffic, and independently scale the latter according to your performance requirements.
As for the retry part, if you stick to non-reactive Spring Kafka applications, you might want to look into the non-blocking retries feature of Spring Kafka. This enables your consumer application to process other records while waiting to retry a failed one - the @Retryable approach is deprecated in favor of DefaultErrorHandler, and both will block consumption while waiting.
Note that with that you lose ordering guarantees, so use it only if the order the requests are processed is not important.
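A minimal sketch of those non-blocking retries via Spring Kafka's @RetryableTopic; the topic name and ThirdPartyClient are assumptions carried over from the sketch above:

```java
import org.springframework.kafka.annotation.DltHandler;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.RetryableTopic;
import org.springframework.retry.annotation.Backoff;
import org.springframework.stereotype.Component;

@Component
class RetryingForwarder {

    private final ThirdPartyClient client; // hypothetical REST client wrapper

    RetryingForwarder(ThirdPartyClient client) {
        this.client = client;
    }

    // Failed records are forwarded to auto-created retry topics with exponential
    // back-off, so the main listener keeps consuming other records meanwhile.
    @RetryableTopic(attempts = "4", backoff = @Backoff(delay = 1000, multiplier = 2.0))
    @KafkaListener(topics = "outbound-requests", groupId = "proxy")
    void forward(String body) {
        client.send(body); // throwing here schedules the record on a retry topic
    }

    // Records that exhaust all attempts land in the dead-letter topic.
    @DltHandler
    void handleDlt(String body) {
        // log / alert / park for manual inspection
    }
}
```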

Message Aggregation using SQS and SpringBoot

I have a use case/situation wherein SQS (standard) will be flooded with messages (north of 500k+), and a microservice (Spring Boot based) listens to these events, consumes them, and makes a REST API call (batch-based) to a 3rd-party SaaS system (have attached a high-level diagram for the same).
The limitation here is that the Spring Boot consumer can receive a max of 10 messages from SQS, transform the payloads, and make the REST API call with those 10 messages (records).
Is there a way to aggregate these messages to, say, 100 messages before making the REST API call (assuming that the target SaaS system accepts 100 records of data)? Would Spring Batch help in this case?
Should I have to look at a different stack for this kind of need? Any help/guidance is much appreciated.
Thanks
What you are describing is actually the chunk-oriented processing model of Spring Batch: items could be read from the queue, accumulated in chunks of 100 items (that is the configurable chunk-size) and posted to your REST API in bulk mode.
Spring Batch handles the chunking of items (and much more) for you. So yes, even though I'm biased, I believe Spring Batch is a very good option for your use case.
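For illustration, a minimal sketch of such a chunk-oriented step in Spring Batch 5 style; the SqsMessage type, the reader that polls SQS, and the writer that makes one bulk REST call per chunk are assumptions you would implement yourself:

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
class AggregationStepConfig {

    @Bean
    Step aggregationStep(JobRepository jobRepository,
                         PlatformTransactionManager transactionManager,
                         ItemReader<SqsMessage> sqsReader,        // hypothetical: polls SQS
                         ItemWriter<SqsMessage> restBulkWriter) { // hypothetical: one POST per chunk
        return new StepBuilder("aggregationStep", jobRepository)
                // chunk size 100: items are accumulated and handed to the
                // writer as a single list of 100, i.e. one bulk REST call
                .<SqsMessage, SqsMessage>chunk(100, transactionManager)
                .reader(sqsReader)
                .writer(restBulkWriter)
                .build();
    }
}
```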
Maybe you should try the Spring Aggregator (Spring Integration).
The Aggregator combines a group of related messages, by correlating and storing them until the group is deemed to be complete. At that point, the aggregator creates a single message by processing the whole group and sends the aggregated message as output.
https://docs.spring.io/spring-integration/reference/html/aggregator.html
And please refer to this GitHub repo for Spring Integration examples with AWS services:
https://github.com/spring-projects/spring-integration-aws/tree/main/src/test/java/org/springframework/integration/aws
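A minimal sketch of such an aggregator using the Java DSL (Spring Integration 6 style); the channel name sqsInputChannel and the saasClient bean making the bulk REST call are assumptions:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.dsl.IntegrationFlow;

@Configuration
class AggregationFlowConfig {

    @Bean
    IntegrationFlow aggregationFlow() {
        return IntegrationFlow.from("sqsInputChannel") // fed by the SQS inbound adapter
                .aggregate(a -> a
                        .correlationStrategy(m -> "batch")      // one logical group for all messages
                        .releaseStrategy(g -> g.size() >= 100)  // release every 100 messages
                        .groupTimeout(5_000)                    // flush partial batches after 5s
                        .sendPartialResultOnExpiry(true)
                        .expireGroupsUponCompletion(true))
                .handle("saasClient", "postBatch")              // hypothetical bulk REST call
                .get();
    }
}
```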
I'm assuming you have multiple instances of your application and can scale up easily if required (since you have 500k+ messages). Still, your application is prone to data loss, so building a reliable system is always challenging. Since you are already on the cloud, you might think about utilizing other cloud services.
I think for your case you should have a look at AWS Kinesis Data Streams and Kinesis Data Firehose.
You can refer to this:
https://aws.amazon.com/blogs/big-data/stream-data-to-an-http-endpoint-with-amazon-kinesis-data-firehose/

JMS message processing with spring integration in cloud environment

I'm currently trying to refactor the processing of JMS messages to work in a distributed/cloud environment. To allow better retry and error handling, the messages are first stored in the database with a JPA entity and then read by a Spring Integration JPA inbound adapter. This works fine as long as just a single instance of my service is running. However, when multiple instances are running, the instances try to process the same message, even after introducing a processing state on the persisted messages.
I have already tried to save the JMS messages in a JDBC message store; however, I would then have to define a group identifier by which an instance could select a message, which is not really possible since the number of instances is dynamic and I cannot assign a group id to each instance. Another possibility could be some kind of distributed lock with a LockRegistry, but I couldn't make that work.
Do you have any hints/advice on how I could best implement the following requirements with Spring Integration:
The JMS message should be persisted
Any instance can pick up the message and process it
If processing fails, it will be retried up to x times (possibly by another instance)
If an instance crashes or gets killed during processing, the message must not be lost
Is there maybe some spring-cloud component which could be helpful?
I'm happy about every hint in which direction I should go.
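For what it's worth, a minimal sketch of the LockRegistry idea mentioned above, using Spring Integration's JdbcLockRegistry; this assumes the standard INT_LOCK table from the Spring Integration JDBC schema exists, and PersistedMessage stands in for the JPA entity from the question:

```java
import java.util.concurrent.locks.Lock;
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.jdbc.lock.DefaultLockRepository;
import org.springframework.integration.jdbc.lock.JdbcLockRegistry;
import org.springframework.integration.jdbc.lock.LockRepository;
import org.springframework.stereotype.Component;

@Configuration
class LockConfig {

    @Bean
    LockRepository lockRepository(DataSource dataSource) {
        return new DefaultLockRepository(dataSource);
    }

    @Bean
    JdbcLockRegistry lockRegistry(LockRepository lockRepository) {
        return new JdbcLockRegistry(lockRepository);
    }
}

@Component
class MessageProcessor {

    private final JdbcLockRegistry lockRegistry;

    MessageProcessor(JdbcLockRegistry lockRegistry) {
        this.lockRegistry = lockRegistry;
    }

    // Only the instance that wins the DB-backed lock processes the message;
    // the others skip it and move on to the next one.
    void tryProcess(PersistedMessage message) {
        Lock lock = lockRegistry.obtain("msg-" + message.getId());
        if (lock.tryLock()) {
            try {
                process(message);
            } finally {
                lock.unlock();
            }
        }
    }

    private void process(PersistedMessage message) {
        // actual handling + marking the entity as processed
    }
}
```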

How to make Spring kafka client distributed

I have messages coming in from Kafka. So I am planning to write a listener and, in onMessage, process each message and push it into Solr.
So my question is more architectural: I have worked on web apps all my career, so in big data, how do I deploy the Spring Kafka listener so that I can process thousands of messages a second?
How do I make my Spring code use multiple nodes to distribute the load?
I am planning to write a Spring Boot application to run in a Tomcat container.
If you use the same group id for all instances, different partitions will be assigned to different consumers (instances of your application).
So be sure you have specified enough partitions in the topic you are going to consume - the partition count is the upper bound on the number of active consumers in a group.
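In Spring Kafka terms, that is just a matter of giving every instance the same group id; the topic and group names here are assumptions:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
class SolrIndexingListener {

    // Every deployed instance uses the same group id, so Kafka assigns each
    // instance a disjoint subset of the topic's partitions. With e.g. 12
    // partitions, up to 12 instances can consume in parallel.
    @KafkaListener(topics = "incoming-events", groupId = "solr-indexer")
    void onMessage(ConsumerRecord<String, String> record) {
        // transform record.value() and push it into Solr here
    }
}
```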

Usage of JMS to call a API which delivers a message

I would like to know if using JMS in the below scenario is feasible or not.
I am adding a feature that calls an API service which will dispatch the emails to the customer.
So I thought of implementing JMS in my application: I would put the events or messages in a queue and write a listener in the same application which would process each message and call the REST API service that dispatches the message to the customers.
My question is: is it good to have JMS between the REST call and our application?
Or should I directly call the REST API to dispatch the messages to the customer?
I think that depends on the availability and overhead of your REST service.
If you know there will be times when your service is down, but you don't want to impact the process using the API, then JMS queues make sense.
Or, if you feel the REST service is causing a bottleneck on the API side and want to queue up the messages somewhere they can survive an outage of your own, then JMS with a provider that supports persistent messages makes sense in this case.
Using JMS would also open the door to completely decoupling the two. Whatever application hosts the REST service could just as easily be converted to pull messages from the JMS queue, without a need to make a REST call, if that seemed more efficient.
Just a few examples of how you could justify using JMS in this scenario.
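For illustration, a minimal sketch of that setup with Spring JMS; the queue name email.requests and the EmailApiClient type are assumptions:

```java
import org.springframework.jms.annotation.JmsListener;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Component;
import org.springframework.stereotype.Service;

// Instead of calling the REST API directly, the application enqueues the event.
@Service
class EmailEventProducer {

    private final JmsTemplate jmsTemplate;

    EmailEventProducer(JmsTemplate jmsTemplate) {
        this.jmsTemplate = jmsTemplate;
    }

    void enqueue(String emailPayload) {
        // with a provider that persists messages, the event survives outages
        jmsTemplate.convertAndSend("email.requests", emailPayload);
    }
}

// A listener in the same application drains the queue and makes the REST call.
@Component
class EmailEventListener {

    private final EmailApiClient apiClient; // hypothetical REST client for the email service

    EmailEventListener(EmailApiClient apiClient) {
        this.apiClient = apiClient;
    }

    @JmsListener(destination = "email.requests")
    void onMessage(String emailPayload) {
        // throwing here lets the broker redeliver per its redelivery policy
        apiClient.dispatch(emailPayload);
    }
}
```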
