Why does my Spring Boot application build up memory over time? - spring-boot

I have a Spring Boot application that uses Spring Integration. The application pulls messages from a RabbitMQ queue, transforms the data from each message, aggregates 50 transformed messages, puts those messages in an array and sends them to a RESTful endpoint as JSON. I am seeing memory slowly creep up until the application crashes.
I ran a profiler on our application and there are instances of VariableLinkedBlockingQueue building up over time. The application seems to clean them up after the application starts, but after some time, the application will just build up these instances. I forced a full garbage collection through the profiler on my application and it cleaned up some instances, but they continue to build up. These instances only go up while messages are being sent to the queue. The prefetch is set to 50.
Why am I seeing these instances build up and how do I fix this?

com.rabbitmq.client.impl.WorkPool uses VariableLinkedBlockingQueue, and its logic is based on registering and unregistering each Channel as a client.
If you close Channels properly, they are unregistered from that pool and therefore their VariableLinkedBlockingQueue becomes eligible for garbage collection.
Without your application to inspect and experiment with, it is very difficult to determine where the leak is.
Spring Integration AMQP support has existed for a while, and if it had a problem such as not closing channels, we would know about it already.
Right now it looks like you are using the ConnectionFactory directly somewhere, outside the framework, and not closing channels/connections after use.
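To make the register/unregister reasoning concrete, here is a minimal sketch in plain Java. It is a simplified model and not the RabbitMQ client's actual code: ToyWorkPool, ToyChannel and ChannelLeakDemo are hypothetical names invented for illustration. It only shows why a channel that is never closed keeps its queue reachable from the pool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a pool that keeps one work queue per registered channel.
class ToyWorkPool {
    private final Map<ToyChannel, BlockingQueue<Runnable>> queues = new HashMap<>();

    synchronized void register(ToyChannel channel) {
        queues.putIfAbsent(channel, new LinkedBlockingQueue<>());
    }

    synchronized void unregister(ToyChannel channel) {
        // Dropping the map entry is what makes the queue unreachable.
        queues.remove(channel);
    }

    synchronized int registeredQueues() {
        return queues.size();
    }
}

// Hypothetical channel: registers itself on creation, unregisters on close().
class ToyChannel implements AutoCloseable {
    private final ToyWorkPool pool;

    ToyChannel(ToyWorkPool pool) {
        this.pool = pool;
        pool.register(this);
    }

    @Override
    public void close() {
        pool.unregister(this);
    }
}

class ChannelLeakDemo {
    // Opens channels and never closes them: the pool retains every queue.
    static int leak(ToyWorkPool pool, int count) {
        for (int i = 0; i < count; i++) {
            new ToyChannel(pool);
        }
        return pool.registeredQueues();
    }

    // try-with-resources closes each channel, so nothing stays registered.
    static int closeProperly(ToyWorkPool pool, int count) {
        for (int i = 0; i < count; i++) {
            try (ToyChannel channel = new ToyChannel(pool)) {
                // publish or consume with the channel here
            }
        }
        return pool.registeredQueues();
    }

    public static void main(String[] args) {
        System.out.println(leak(new ToyWorkPool(), 3));          // prints 3
        System.out.println(closeProperly(new ToyWorkPool(), 3)); // prints 0
    }
}
```

The same idea applies to any code that opens channels by hand: scope each one with try-with-resources (or an explicit close in a finally block) so the unregister step always runs.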

Related

Advisable to run a Kafka producer + consumer in same application?

Spring + Apache Kafka noob here. I'm wondering if it's advisable to run a single Spring Boot application that handles both producing messages and consuming messages.
A lot of the applications I've seen using Kafka lately usually have one separate application send/emit the message to a Kafka topic, and another one that consumes/processes the message from that topic. For larger applications, I can see a case for separate producer and consumer applications, but what about smaller ones?
For example: I have a simple app that processes HTTP requests and sends them to a third-party service, but to ensure retryability, I put each request on a Kafka topic and consume it with a service using the @Retryable annotation?
And what other considerations might come into play since it would be on the Spring framework?
Note: As your question states, what I'll say is more advice based on my beliefs and experience than some absolute truth written in stone.
Your use case seems more like a proxy than an actual application with business logic. You should make sure that making this an asynchronous service makes sense - maybe it's good enough to simply hold the connection until you get a response from the third party, and let your client handle retries if you get an error - of course, you can also retry until some timeout.
This would avoid common asynchronous issues such as making your client need to poll or have a webhook in order to get a result, or making sure a record still makes sense to be processed after a lot of time has elapsed after an outage or a high consumer lag.
If your client doesn't care about the result as long as it gets done, and you don't expect high-throughput on either side, a single Spring Boot application should be enough for handling both producer and consumer sides - while also keeping it simple.
If you do expect high throughput, I'd look into building a WebFlux based application with the reactor-kafka library - high throughput proxies are an excellent use case for reactive applications.
Another option would be having a simple serverless function that handles the HTTP requests and produces the records, and a standard Spring Boot application to consume them.
TBH, I don't see a use case where having two full-fledged Java applications to handle a proxy duty would pay off, unless maybe you have infrastructure so sound that managing two applications instead of one makes no difference and using more resources is not an issue.
Actually, if you expect really high traffic and a serverless function wouldn't work, or maybe you want to stick to Java-based solutions, then you could have a simple WebFlux-based application to handle the HTTP requests and send the messages, and a standard Spring Boot or another WebFlux application to handle consumption. This way you'd be able to scale up the former in order to accommodate the high traffic, and independently scale the latter in line with your performance requirements.
As for the retry part, if you stick to non-reactive Spring Kafka applications, you might want to look into the non-blocking retries feature from Spring Kafka. This will enable your consumer application to process other records while waiting to retry a failed one. The @Retryable approach is deprecated in favor of DefaultErrorHandler, and both will block consumption while waiting.
Note that with that you lose ordering guarantees, so use it only if the order the requests are processed is not important.
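The ordering caveat can be illustrated with a small plain-Java sketch. This is not Spring Kafka code: NonBlockingRetrySketch and its in-memory retry queue are hypothetical stand-ins for a consumer and a retry topic, and the sketch assumes a failed record succeeds on its second attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Set;

class NonBlockingRetrySketch {
    // Returns records in the order their processing completes.
    // A record listed in failOnce fails its first attempt and is parked on a
    // retry queue (standing in for a retry topic) instead of blocking the rest.
    static List<String> process(List<String> records, Set<String> failOnce) {
        Deque<String> retryQueue = new ArrayDeque<>();
        List<String> completed = new ArrayList<>();

        for (String record : records) {
            if (failOnce.contains(record)) {
                retryQueue.add(record);   // defer, keep consuming
            } else {
                completed.add(record);    // processed on first attempt
            }
        }

        // Deferred records are retried later; assumed to succeed this time.
        while (!retryQueue.isEmpty()) {
            completed.add(retryQueue.poll());
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> order = process(
                Arrays.asList("a", "b", "c"),
                Collections.singleton("b"));
        System.out.println(order); // prints [a, c, b]
    }
}
```

Record "c" completes before "b" even though "b" arrived first, which is exactly the ordering trade-off described above.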

Parallel processing in multiple instances of spring boot application

I am not sure how to proceed. I am using Spring Boot 2, Oracle, and IBM MQ.
I have made 2 async requests to external applications. I need to perform an operation once I have received both responses.
I am not able to set this up, as there are multiple instances of the application running and listening to the same queue for responses.
I tried using @Transactional and a CyclicBarrier, but I guess they only work within the scope of their own instance and not across multiple instances.
How should I proceed?
It is also really difficult to reproduce the scenario where one message is read by one instance and the other by another instance at the same time, so that they eventually try to update the database at the same time.

JMS message processing with spring integration in cloud environment

I'm currently trying to refactor the processing of JMS messages to work in a distributed/cloud environment. To allow better retry and error handling, the messages are first stored in the database with a JPA entity and then read by the Spring Integration JPA inbound adapter. This works fine as long as just a single instance of my service is running. However, when multiple instances are running, the instances try to process the same message, even after introducing a processing state on the persisted messages.
I have already tried to save the JMS messages in a JDBC message store. However, I would then have to define a group identifier by which an instance could select a message, which is not really possible since the number of instances is dynamic and I cannot assign a group id to each instance. Another possibility could be some kind of distributed lock with a LockRegistry, but I couldn't make that work.
Do you have any hints or advice on how best to implement the following requirements with Spring Integration:
JMS message should be persisted
Any instance can pick up the message and process it
If the processing fails there will be a retry for x times (could also be retried by another instance)
If an instance crashes or gets killed during the processing the message must not be lost
Is there maybe some spring-cloud component which could be helpful?
I'm happy about every hint in which direction I should go.

Deferred consumption of message queue

Sorry this might sound naive to JMS gurus, but still.
I have a requirement where a Spring based application is not able to connect synchronously to a SAP back-end (via their web-service interface) because the response from SAP is way too slow. We are thinking of a solution where the updates from the GUI would be saved by the Spring middleware in a local database, while simultaneously sending a message to a JMS queue. We want a batch job to run every few hours (or maybe nightly) to consume the messages from the JMS queue and, based on the message contents, query the local database and send the result to the SAP web-service.
Is this approach correct? Would I need a batch to trigger the JMS message consumption (because I don't want to consume the message immediately but in a deferred manner and at a pre-decided time)? Is there any way in Spring to implement this gracefully (like Camel)? Appreciate your help.
Spring Batch has a JmsItemReader that can be used in a batch program; an empty queue signals the end of the batch. Spring Cloud Task integrates with Spring Batch and can be used for cloud deployments.
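The "empty queue ends the batch" rule can be sketched without Spring. JmsItemReader itself needs Spring Batch and a JMS broker, so this example models only the stopping condition: a java.util.concurrent BlockingQueue stands in for the JMS queue, and DeferredDrainSketch and its timeout parameter are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class DeferredDrainSketch {
    // Reads messages until a receive attempt times out with nothing available,
    // which is treated as the end of this batch run.
    static List<String> drain(BlockingQueue<String> queue, long timeoutMillis) {
        List<String> batch = new ArrayList<>();
        try {
            String message;
            while ((message = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS)) != null) {
                batch.add(message);
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt and return what was read so far.
            Thread.currentThread().interrupt();
        }
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("update-1");
        queue.add("update-2");

        // A scheduler (nightly, every few hours) would call this at the chosen time.
        System.out.println(drain(queue, 50)); // prints [update-1, update-2]
    }
}
```

Messages sent between runs simply accumulate on the queue, so nothing needs to be consumed at the moment it arrives.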

How to programmatically defer JMS topic message consumption using Spring

I have an application that consumes messages from a JMS topic. As part of the normal application flow it needs to periodically cease consumption of messages. While the application is in this state new messages are stored in the topic (note that my application is still running). Later the application resumes message consumption, also receiving those messages that were placed on the topic while the application wasn't listening.
This functionality is currently achieved by creating and disposing of connections from a ConnectionFactory. However, I now wish to migrate the application to Spring JMS. Although Spring rather neatly abstracts away much of the JMS boilerplate, I no longer appear to have fine-grained control over the underlying connection and hence cannot halt message consumption on demand.
Before I try to wade through Spring JMS internals, can anyone suggest a neat way of doing this?
Can you just avoid returning from onMessage()? How long do you want to stop consumption? Is your problem similar to https://stackoverflow.com/a/628337/20734 ?
