Spring Batch step issue

A Spring JMS listener invokes a Spring Batch job when it receives a message. It is configured to use a DefaultMessageListenerContainer with a concurrency of 5 and a max concurrency of 15.
The Spring Batch job definition has 4 steps, each configured as a tasklet.
When multiple requests are submitted, the JMS listener picks up 5 messages and runs the Spring Batch job for each.
Occasionally, however, a few jobs take more time to execute when moving from one step to the next. I couldn't find any specific reason why Spring Batch spends extra time between step executions, that is, after one step has completed and before the next step is started. This doesn't happen every time.
Any insights on this specific problem?
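One way to narrow this down (a diagnostic sketch, not a fix): attach a StepExecutionListener to each tasklet step and log wall-clock timestamps at the step boundaries, so the gap between one step's end and the next step's start becomes visible in the logs and can be correlated with load. The listener class below is hypothetical.

```java
import java.time.Instant;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;

// Hypothetical diagnostic listener: logs when each step actually starts
// and finishes, making inter-step gaps measurable.
public class StepTimingListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        System.out.printf("step %s starting at %s%n",
                stepExecution.getStepName(), Instant.now());
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        System.out.printf("step %s finishing at %s%n",
                stepExecution.getStepName(), Instant.now());
        return stepExecution.getExitStatus();
    }
}
```

If the gap only shows up when 5 jobs run concurrently, contention on the JobRepository metadata tables (which Spring Batch updates between steps) is a plausible place to look, though that is an assumption to verify, not a confirmed cause.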

Related

Asynchronous Kafka consumer in Spring Batch Application

In our Spring Batch application's workers, the item processors also interact with another service asynchronously through Kafka. The requirement is that we need an acknowledgement in order to retry failed batches, but the condition is not to wait for that acknowledgement.
Is there any mechanism in Spring Batch by which we can consume Kafka asynchronously?
Is it possible to rerun a specific local worker step in a rerun of the job?
We implement producers and consumers over the same step using a Spring Batch decider. Thus, during the first run it only produces to Kafka, and on the second run it consumes from Kafka.
We are looking for a solution where we can consume Kafka asynchronously in the Spring Batch application in order to rerun a specific worker step.
According to your diagram, you are making that call from an item processor. The closest "feature" you can get from Spring Batch is the AsyncItemProcessor. This is a special processor that processes items asynchronously in a separate thread and returns a Future; the Future is then unwrapped by an AsyncItemWriter, which writes the result of the call.
Other than that, I do not see any other obvious way to do this with a built-in feature of Spring Batch, so you would have to manage it in a custom ItemProcessor.
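For illustration, here is a minimal sketch of that wiring (the Order item type, the bean names, and the delegate processor that makes the Kafka call are all hypothetical placeholders for your own beans):

```java
import org.springframework.batch.integration.async.AsyncItemProcessor;
import org.springframework.batch.integration.async.AsyncItemWriter;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

@Configuration
public class AsyncStepConfig {

    // Wraps the delegate so each item is processed on a separate thread;
    // the processor's output is a Future of the delegate's result.
    @Bean
    public AsyncItemProcessor<Order, Order> asyncItemProcessor(
            ItemProcessor<Order, Order> kafkaCallingProcessor) {
        AsyncItemProcessor<Order, Order> processor = new AsyncItemProcessor<>();
        processor.setDelegate(kafkaCallingProcessor);
        processor.setTaskExecutor(new SimpleAsyncTaskExecutor());
        return processor;
    }

    // Unwraps each Future and hands the result to the delegate writer.
    @Bean
    public AsyncItemWriter<Order> asyncItemWriter(ItemWriter<Order> delegateWriter) {
        AsyncItemWriter<Order> writer = new AsyncItemWriter<>();
        writer.setDelegate(delegateWriter);
        return writer;
    }

    // Hypothetical item type standing in for your domain object.
    public static class Order { }
}
```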

How to stop jobs from Spring Cloud Data Flow immediately

I have used Spring Cloud Data Flow to control some batch jobs. In SCDF, after I defined some tasks, they were launched as jobs with a running status. When I tried to stop a particular job, it did not stop immediately. I found that the job kept running until it finished its current step.
For example, my job 'ABC' has 2 steps, A and B. In SCDF, I stop job 'ABC' while step A is being executed, and job 'ABC' keeps running until step A is completed; it then does not execute step B.
So, are there any ways to stop a job immediately from Spring Cloud Data Flow?
From Spring Cloud Data Flow, the batch job stop operation is delegated to the Spring Batch API. This means there is nothing Spring Cloud Data Flow itself offers to stop a batch job immediately; it needs to be handled by Spring Batch or the job implementation itself.
When a stop request is sent for a batch job execution (if it is running), the terminateOnly flag on the current step execution is set to true, which means the step execution is ready to be stopped; whether and when it actually stops depends on the underlying step implementation.
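In practice this means a long-running step has to cooperate. For a custom tasklet, one pattern (a sketch with hypothetical work methods, not the only option) is to do a single unit of work per invocation and return RepeatStatus.CONTINUABLE, so the framework's interruption check between invocations can see the terminateOnly flag and end the step promptly:

```java
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class CooperativeTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        doUnitOfWork();
        // Returning CONTINUABLE hands control back to the framework after
        // each unit of work; the step's interruption policy then checks
        // terminateOnly and can end the step as STOPPED instead of letting
        // it run to completion.
        return hasMoreWork() ? RepeatStatus.CONTINUABLE : RepeatStatus.FINISHED;
    }

    private void doUnitOfWork() { /* hypothetical unit of work */ }

    private boolean hasMoreWork() { /* hypothetical */ return false; }
}
```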

Create a job from a JMS message receive

Using WildFly 15 and only Java EE (no Spring), I need to consume messages from a JMS queue in order, and for every message create and run a new job using JBatch, in sequence, without job overlap.
For example:
JMS queue: --> msgC --> msgB --> msgA
JBatch:
on receiving msgC, create JobC, run JobC
wait for JobC to end; then, watching the JMS queue, on receiving msgB, create JobB, run JobB
wait for JobB to end; then, watching the JMS queue, on receiving msgA, create JobA, run JobA
Is it possible to achieve this?
Processing messages in parallel or in the right sequence is standard behaviour for JMS clients, and you can simply configure it to work correctly; that's why you have a queue. Just ensure you have only one message-driven bean working on it, which gives you a single consumer and nothing running in parallel.
If you hand the task over to the batch API, a different set of threads will process it, and you then need to manually ensure one job terminates before the next can start. So your message-driven bean would have to poll and wait until the batch has executed.
Why would you do this, when it just makes your life more complicated?
That said, you could still benefit from the easy orchestration of batch steps, the restart capability, or some parallel execution, all of which you would otherwise have to cover in your message-driven bean yourself.
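If you do go through the batch API anyway, a rough sketch of such a polling message-driven bean could look like this. The queue lookup and the job XML name "myJob" (resolved against META-INF/batch-jobs/) are hypothetical, and maxSession is a WildFly/Artemis-specific activation property used here to force a single consumer:

```java
import java.util.Properties;

import javax.batch.operations.JobOperator;
import javax.batch.runtime.BatchRuntime;
import javax.batch.runtime.BatchStatus;
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationLookup",
                propertyValue = "java:/jms/queue/JobQueue"),
        @ActivationConfigProperty(propertyName = "destinationType",
                propertyValue = "javax.jms.Queue"),
        // One session means one message is processed at a time.
        @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "1")
})
public class JobLaunchingMdb implements MessageListener {

    @Override
    public void onMessage(Message message) {
        JobOperator operator = BatchRuntime.getJobOperator();
        long executionId = operator.start("myJob", new Properties());
        // Block this MDB until the job reaches a terminal state, so the
        // next message (and therefore the next job) cannot start earlier.
        BatchStatus status;
        do {
            try {
                Thread.sleep(1000L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            status = operator.getJobExecution(executionId).getBatchStatus();
        } while (status == BatchStatus.STARTING || status == BatchStatus.STARTED);
    }
}
```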

Spring scheduled task with JMS

I'm just starting out with Spring (specifically, I'm starting with Spring Boot) and want to create a long-running program that works on a scheduled task (i.e. @Scheduled), e.g. start processing between 7pm and 11pm. I'm OK with this bit.
The task will take a message from an ActiveMQ queue and process it, sleep a little, then get another, and repeat.
Being new to JMS/ActiveMQ as well: is it possible to use Spring's @JmsListener in conjunction with the scheduler to achieve this, and if so, how?
If not, I take it my scheduled task should simply use point-to-point access to the queue to pull messages off. If so, does anyone have a simple example? I prefer to use Spring Boot but can't find any good ones; they all seem to use listeners.
Thanks.
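For the pull-based, point-to-point approach described above, here is a minimal sketch, assuming Spring Boot's auto-configured JmsTemplate, scheduling enabled via @EnableScheduling, and a hypothetical queue name "work.queue":

```java
import javax.jms.Message;

import org.springframework.jms.core.JmsTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class ScheduledQueueConsumer {

    private final JmsTemplate jmsTemplate;

    public ScheduledQueueConsumer(JmsTemplate jmsTemplate) {
        this.jmsTemplate = jmsTemplate;
        // Return null instead of blocking forever when the queue is empty.
        this.jmsTemplate.setReceiveTimeout(2000L);
    }

    // Fires every minute between 19:00 and 22:59, i.e. the 7pm-11pm window.
    @Scheduled(cron = "0 * 19-22 * * *")
    public void pullOneMessage() throws InterruptedException {
        Message message = jmsTemplate.receive("work.queue"); // hypothetical queue
        if (message != null) {
            process(message);
            Thread.sleep(500L); // "sleep a little" between messages
        }
    }

    private void process(Message message) { /* hypothetical processing */ }
}
```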

Recommended approach for parallel spring batch jobs

The Spring Batch Integration documentation explains how to use remote chunking and partitioning for steps, see
http://docs.spring.io/spring-batch/trunk/reference/html/springBatchIntegration.html#externalizing-batch-process-execution
Our jobs do not consist of straightforward reader/processor/writer steps, so we simply want to have whole jobs running in parallel, with each job being farmed out to a different partition.
Is there already a pattern for this in Spring Batch? Or would I need to implement my own JobLauncher to maintain a pool of slaves to launch jobs on?
Cheers,
Menno
Spring Batch deliberately takes the position of not handling job orchestration (which is fundamentally what your question is about). There are a few approaches for something like this:
Distributed scheduler - Most distributed schedulers have the ability to execute tasks on multiple nodes; Quartz, for example, supports clustering.
Using remote partitioning for orchestration - Remote partitioning executes full Spring Batch steps on slaves. There's no reason those steps couldn't be job steps that execute an entire job.
Message-driven job launching - Spring Batch Integration (a child module of Spring Batch) provides the facilities to launch jobs via messages. One approach is to have a collection of slaves listening to a queue, waiting for a message to launch a job. You'd have to handle things like load balancing between the slaves somehow, but this is another common way of handling job orchestration.
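To illustrate the third option, here is a sketch using Spring Batch Integration's JobLaunchingGateway. The channel names, the configuration class, and the job parameters are hypothetical; each slave would subscribe a gateway like this to a shared request queue:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.integration.launch.JobLaunchRequest;
import org.springframework.batch.integration.launch.JobLaunchingGateway;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.ServiceActivator;

@Configuration
public class MessageDrivenLaunchConfig {

    // Each slave runs one of these; a JobLaunchRequest arriving on the
    // "jobRequests" channel launches a full job on whichever slave
    // consumes the message.
    @Bean
    @ServiceActivator(inputChannel = "jobRequests")
    public JobLaunchingGateway jobLaunchingGateway(JobLauncher jobLauncher) {
        JobLaunchingGateway gateway = new JobLaunchingGateway(jobLauncher);
        gateway.setOutputChannelName("jobResults"); // hypothetical reply channel
        return gateway;
    }

    // The master builds one request per job run; unique parameters keep
    // each launch a distinct JobInstance.
    public static JobLaunchRequest requestFor(Job job, String runId) {
        return new JobLaunchRequest(job,
                new JobParametersBuilder().addString("run.id", runId).toJobParameters());
    }
}
```

How the "jobRequests" channel is backed (JMS, RabbitMQ, Kafka, etc.) is up to you; a point-to-point queue gives you the competing-consumers load balancing between slaves more or less for free.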
