How to launch a long-running Java EE job? - jms

I need to fire off a long-running, batch-type job, and by long we are talking about a job that can take a couple of hours. The EJB that has the logic to run this long-running job will communicate with a NoSQL store, load data, etc.
So, I am using JMS MDBs to do this asynchronously. However, as each job can potentially take an hour or more (let's assume 4 hours max), I don't want the onMessage() method in the MDB to be waiting that long. So I was thinking of firing off an asynchronous EJB within the MDB's onMessage() method, so that the MDB can be returned to the pool right after the call to the batch runner EJB.
Does it make sense to combine an asynchronous EJB method call with an MDB? Most samples suggest using one or the other to achieve the same thing.
If the EJB invoked from the MDB is not asynchronous, then the MDB will be waiting for a potentially long time.
Please advise.

I would simplify things: use @Schedule to invoke @Asynchronous and forget about JMS. One less thing that can go wrong.
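For illustration, a minimal sketch of that approach, assuming container-managed EJBs; the class and method names (BatchJobRunner, NightlyTrigger, runJob) and the schedule are hypothetical placeholders, not from the question:

    // BatchJobRunner.java - the long-running work, run asynchronously.
    import javax.ejb.Asynchronous;
    import javax.ejb.Stateless;

    @Stateless
    public class BatchJobRunner {

        // The container runs this on a background thread; the caller
        // returns immediately instead of blocking for hours.
        @Asynchronous
        public void runJob() {
            // long-running work: talk to the NoSQL store, load data, etc.
        }
    }

    // NightlyTrigger.java - kicks off the job on a timer, no JMS involved.
    import javax.ejb.EJB;
    import javax.ejb.Schedule;
    import javax.ejb.Singleton;

    @Singleton
    public class NightlyTrigger {

        @EJB
        private BatchJobRunner runner;

        // Fires at 02:00 every day (an illustrative schedule); the
        // @Asynchronous call returns at once, so this method stays fast.
        @Schedule(hour = "2", persistent = false)
        public void kickOffBatch() {
            runner.runJob();
        }
    }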
Whilst not yet ready for prime time, JSR 352: Batch Applications looks very promising for this sort of stuff.
https://blogs.oracle.com/arungupta/entry/batch_applications_in_java_ee

It's a matter of taste I guess.
Whether a thread from the JMS pool runs your job or an async EJB does it, the end result is the same - a thread from some pool will be blocked.
There is nothing wrong with spawning an async bean from an MDB: you might want the jobs triggered by a messaging interface, but not want to block the MDB's thread pool. Also, consider that transactions often time out by default well before an hour, so if your MDB is transactional for some reason, that is another argument for firing off the async EJB inside onMessage().
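To make the combination concrete, here is a minimal sketch (the listener class and the queue lookup name are hypothetical) of an MDB that hands the work to the async EJB from the sketch above, so that onMessage() returns right away and the MDB's transaction stays short:

    // JobQueueListener.java - delegates to an @Asynchronous EJB so that
    // onMessage() returns quickly and the JMS transaction cannot time out.
    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.EJB;
    import javax.ejb.MessageDriven;
    import javax.jms.Message;
    import javax.jms.MessageListener;

    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destinationLookup",
                                  propertyValue = "jms/jobQueue")
    })
    public class JobQueueListener implements MessageListener {

        @EJB
        private BatchJobRunner runner; // has the @Asynchronous runJob()

        @Override
        public void onMessage(Message message) {
            // Returns immediately; the MDB goes back to the pool while the
            // async method does the hours-long work on another thread.
            runner.runJob();
        }
    }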

I think Petter answers most of the question. If you are only using the MDB to get asynchronous behaviour, you could just fire the @Asynchronous method directly.
But if you are interested in any of the other features your JMS implementation might offer in terms of reliability, persistent queues, slow-consumer policies, or priority on jobs, you should stick with MDBs.
One of the reasons for introducing @Asynchronous in EJB 3.1 was to provide a more lightweight way to do asynchronous processing when the other JMS/MDB features are not needed.

Related

Advisable to run a Kafka producer + consumer in same application?

Spring + Apache Kafka noob here. I'm wondering if it's advisable to run a single Spring Boot application that handles both producing and consuming messages.
A lot of the applications I've seen using Kafka lately usually have one separate application send/emit the message to a Kafka topic, and another one that consumes/processes the message from that topic. For larger applications, I can see a case for separate producer and consumer applications, but what about smaller ones?
For example: say I have a simple app that processes HTTP requests and sends them on to a third-party service; to ensure retryability, I put each request on a Kafka queue and consume it with a service using the @Retryable annotation?
And what other considerations might come into play since it would be on the Spring framework?
Note: As your question suggests, what I'll say is more advice based on my beliefs and experience than some absolute truth written in stone.
Your use case seems more like a proxy than an actual application with business logic. You should make sure that making this an asynchronous service makes sense - maybe it's good enough to simply hold the connection until you get a response from the third party, and let your client handle retries if you get an error - of course, you can also retry until some timeout.
This would avoid common asynchronous issues such as making your client need to poll or have a webhook in order to get a result, or making sure a record still makes sense to be processed after a lot of time has elapsed after an outage or a high consumer lag.
If your client doesn't care about the result as long as it gets done, and you don't expect high-throughput on either side, a single Spring Boot application should be enough for handling both producer and consumer sides - while also keeping it simple.
If you do expect high throughput, I'd look into building a WebFlux based application with the reactor-kafka library - high throughput proxies are an excellent use case for reactive applications.
Another option would be having a simple serverless function that handles the http requests and produces the records, and a standard Spring Boot application to consume them.
TBH, I don't see a use case where having two full-fledged Java applications on proxy duty would pay off, unless maybe you have such a solid infrastructure for managing them that running two applications instead of one makes no difference, and the extra resource use is not an issue.
Actually, if you expect really high traffic and a serverless function wouldn't work, or you want to stick to Java-based solutions, then you could have a simple WebFlux-based application handle the HTTP requests and send the messages, and a standard Spring Boot or another WebFlux application handle consumption. This way you'd be able to scale up the former to accommodate the high traffic, and independently scale the latter to match your performance requirements.
As for the retry part, if you stick with non-reactive Spring Kafka applications, you might want to look into the non-blocking retries feature of Spring Kafka. This lets your consumer application process other records while waiting to retry a failed one - the @Retryable approach is deprecated in favor of DefaultErrorHandler, and both block consumption while waiting.
Note that with non-blocking retries you lose ordering guarantees, so use them only if the order in which requests are processed is not important.
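A hedged sketch of that non-blocking retry feature, assuming spring-kafka 2.7 or later; the topic name and listener class are illustrative placeholders:

    // ProxyRequestListener.java - non-blocking retries via @RetryableTopic.
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.annotation.RetryableTopic;
    import org.springframework.retry.annotation.Backoff;
    import org.springframework.stereotype.Component;

    @Component
    public class ProxyRequestListener {

        // Failed records are forwarded to auto-created retry topics and
        // finally to a dead-letter topic, so the main listener keeps
        // consuming instead of blocking while it waits out a backoff.
        @RetryableTopic(
            attempts = "4",
            backoff = @Backoff(delay = 1000, multiplier = 2.0))
        @KafkaListener(topics = "requests")
        public void onRequest(String payload) {
            // call the third-party service here; throwing an exception
            // sends the record to the next retry topic
        }
    }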

Parallel processing in multiple instances of spring boot application

I am not able to work out how to proceed. I am using Spring Boot 2, Oracle, and IBM MQ.
I have made 2 async requests to external applications. I need to do some operation once I have received both of the responses.
I am not able to set this up, as there are multiple instances of the application running and listening to the same response queue.
I tried using @Transactional and a CyclicBarrier. But I guess they only work within the scope of their own instance, not across multiple instances.
How should I proceed?
It is also really difficult to reproduce the scenario where one message is read by one instance and the other by another instance at the same time, such that they eventually try to update the DB at the same time.

JMS listener using thread pool

I'm working on a Spring project where I have an implementation of a JMS event listener to process messages from a queue.
To be precise, I'm using a SQS (AWS) queue.
All works fine.
My point is this:
I haven't configured anything about concurrency, but I would like to have more listener threads to improve the performance (speed) of message processing from the queue.
I'm thinking about configuring a ThreadPool (TaskExecutor) and adding the @Async annotation to my message-processing methods.
So I would have an onMessage method in my listener where, after message validation, I call these async methods.
Is this good practice? Will I have any issues with this approach?
Looking on the web, I see that it's possible to configure the concurrency value directly on the listener.
I'm very confused: there are a lot of possible ways to do this and I'm not able to work out the best approach.
Are these equivalent solutions?
Do not use @Async - simply increase the concurrency of the listener container and it will be handled for you automatically by spring-jms.
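For example, with spring-jms you can set a concurrency range on the listener container factory (the bean and the range below are illustrative), and the container manages the threads for you:

    // JmsConfig.java - raise listener concurrency instead of using @Async.
    import javax.jms.ConnectionFactory;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jms.annotation.EnableJms;
    import org.springframework.jms.config.DefaultJmsListenerContainerFactory;

    @Configuration
    @EnableJms
    public class JmsConfig {

        @Bean
        public DefaultJmsListenerContainerFactory jmsListenerContainerFactory(
                ConnectionFactory connectionFactory) {
            DefaultJmsListenerContainerFactory factory =
                    new DefaultJmsListenerContainerFactory();
            factory.setConnectionFactory(connectionFactory);
            // 3 core consumers, scaling up to 10 under load; each consumer
            // invokes onMessage() on its own container-managed thread.
            factory.setConcurrency("3-10");
            return factory;
        }
    }

The same range can also be set per listener with @JmsListener(concurrency = "3-10").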

Spring scheduled task with jms

I'm just starting out with Spring (specifically, I'm starting with Spring Boot) and want to create a long-running program that works on a scheduled task (i.e. @Scheduled), e.g. start processing between 7pm and 11pm. I'm OK with this bit.
The task will take a message from an ActiveMQ queue and process it, sleep a little, then get another and repeat.
Being new to JMS/ActiveMQ as well: is it possible to use Spring's @JmsListener in conjunction with the scheduler to achieve this, and if so, how?
If not, I take it my scheduled task should simply use point-to-point access to the queue to pull messages off. If so, does anyone have a simple example? I prefer Spring Boot but can't find any good examples; they all seem to use listeners.
thanks.
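For what it's worth, here is a hedged sketch of the polling approach described above: an @Scheduled method that pulls messages with JmsTemplate.receive() instead of using a @JmsListener. The cron expression and queue name are illustrative placeholders, and @EnableScheduling is assumed on a configuration class:

    // ScheduledQueueProcessor.java - poll the queue on a schedule.
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.TextMessage;
    import org.springframework.jms.core.JmsTemplate;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    public class ScheduledQueueProcessor {

        private final JmsTemplate jmsTemplate;

        public ScheduledQueueProcessor(JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
            // without a timeout, receive() would block indefinitely
            this.jmsTemplate.setReceiveTimeout(5000);
        }

        // Runs once a minute between 19:00 and 22:59.
        @Scheduled(cron = "0 * 19-22 * * *")
        public void pollQueue() throws JMSException {
            Message message = jmsTemplate.receive("my.queue");
            if (message instanceof TextMessage) {
                process(((TextMessage) message).getText());
            }
        }

        private void process(String body) {
            // handle the message, sleep a little, repeat on the next tick
        }
    }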

Understanding JMS integration testing with Spring SingleConnectionFactory and CachingConnectionFactory

Please help me understand the following:
I am using CachingConnectionFactory in my app and first used it during my JMS tests, to test my JMS config: guaranteed delivery, rollback/commit, etc.
I am using Spring's JmsTemplate for sending and DefaultMessageListenerContainer during delivery.
I noticed that this is hard/impossible when several test methods run sequentially.
Example: in test method A I throw exceptions in the message listener (consumer side) so that retries occur.
Then test B runs and does a different test, but when it starts I still get retry messages from test A, which I clearly do not want.
I purge the queue through JMX between tests, but I still receive these retries :(...
I searched and debugged... I don't exactly understand why these retries keep coming up, even when I am sure the purge occurs correctly. Maybe it was already cached somewhere in the session... I don't know. Does anybody have any idea?
I found out that I needed to use the SingleConnectionFactory during testing. With this connection factory the retries disappear, but I don't really understand why.
I understand that it uses only one connection (from the Spring reference), and I noticed that it somehow removes the consumer after every send action, but I don't really understand what happens with these retries :(... Any idea?
(It's hard to debug because of the multi-threading behavior, and it's difficult to find good information about it on the web.)
Also, using CachingConnectionFactory with a session cache size of 1 didn't solve the retry issue.
Thanks
Your best bet would probably be to use an embedded broker and start/stop it between each test; make sure deleteAllMessagesOnStartup is set to true and the broker will purge the store for you, ensuring a clean slate for each test. You might also benefit from having a look at ActiveMQ's unit tests - they are a good source of examples of how the broker can be used in automated tests.
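A minimal sketch of that embedded-broker setup (JUnit 4 style; the base class, broker name, and connector URI are illustrative):

    // EmbeddedBrokerTestBase.java - fresh broker per test, clean store.
    import org.apache.activemq.broker.BrokerService;
    import org.junit.After;
    import org.junit.Before;

    public abstract class EmbeddedBrokerTestBase {

        protected BrokerService broker;

        @Before
        public void startBroker() throws Exception {
            broker = new BrokerService();
            broker.setBrokerName("test-broker");
            broker.setUseJmx(false);
            // wipe the message store so no redeliveries leak between tests
            broker.setDeleteAllMessagesOnStartup(true);
            broker.addConnector("tcp://localhost:61616");
            broker.start();
        }

        @After
        public void stopBroker() throws Exception {
            broker.stop();
        }
    }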
It's not an easy thing to fix: removing the messages between tests.
I tried many things, as mentioned above: stopping/starting the broker and Spring's DefaultMessageListenerContainer, which I use to consume my messages.
It all seemed to work until I set the cache level in DefaultMessageListenerContainer to CACHE_CONSUMER so that the consumer is cached.
That is required for the redeliveryPolicy to work.
However, this messed everything up, and messages were seemingly cached by DefaultMessageListenerContainer in some way.
In the end, I solved it by simply consuming all messages after each test (just wait a second and consume everything), so that the next test can begin.
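That drain step can be as simple as looping JmsTemplate.receive() with a short timeout until the queue comes up empty (a sketch; the helper class is hypothetical):

    // QueueDrainer.java - consume leftovers so the next test starts clean.
    import javax.jms.Message;
    import org.springframework.jms.core.JmsTemplate;

    public final class QueueDrainer {

        private QueueDrainer() {}

        public static void drain(JmsTemplate jmsTemplate, String queueName) {
            jmsTemplate.setReceiveTimeout(1000); // wait at most 1s per poll
            Message message;
            do {
                message = jmsTemplate.receive(queueName);
            } while (message != null);
        }
    }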
