What is the difference between Spring scheduled tasks and Spring Batch jobs?

I don't understand the difference between scheduled tasks and batch jobs in Spring.
By scheduled tasks I mean the ones that are configured like this:
@EnableScheduling
public class AppConfig {
..
and used like this:
@Scheduled(fixedRate=550)
public void doSomething() {
..
By batch jobs I mean these:
@EnableBatchProcessing
public class AppConfig {
..
together with many implementation pieces such as:
Job, JobLauncher, Step, ItemReader, ItemWriter, etc.
I would like to know the main difference between them beyond the implementation details. I am also curious why one would use batch jobs, with their long implementations, when simple scheduled tasks are available. Scheduled tasks are quite easy to implement, but maybe they have disadvantages compared to batch jobs?

Spring Scheduler is for orchestrating something based on a schedule. Spring Batch is a robust batch processing framework designed for building jobs that solve complex compute problems. Spring Batch does not handle the orchestration of jobs, only the building of them. You can orchestrate Spring Batch jobs with Spring Scheduler if you want.
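To make the split between "when" and "what" concrete, here is a minimal plain-JDK sketch. It is not Spring code: `ScheduledExecutorService` stands in for the scheduler, and the hypothetical `CountingJob` stands in for a batch job. All class and method names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Plain-JDK sketch: the job says what to do, the scheduler says when to do it. */
final class ScheduledJobSketch {

    /** Hypothetical stand-in for a batch job. It knows nothing about timing. */
    static final class CountingJob implements Runnable {
        final AtomicInteger runs = new AtomicInteger();

        @Override
        public void run() {
            runs.incrementAndGet(); // real work (read, process, write) would go here
        }
    }

    /**
     * Triggers the job at a fixed rate until it has run the requested number of
     * times (or a 5 second timeout expires), then stops the scheduler.
     * Returns how many times the job actually ran.
     */
    static int runOnSchedule(CountingJob job, int times, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(times);
        try {
            scheduler.scheduleAtFixedRate(() -> {
                if (remaining.getCount() > 0) { // ignore ticks after the last wanted run
                    job.run();
                    remaining.countDown();
                }
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
            remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            scheduler.shutdownNow();
        }
        return job.runs.get();
    }

    public static void main(String[] args) {
        int runs = runOnSchedule(new CountingJob(), 3, 100);
        System.out.println("job ran " + runs + " times");
    }
}
```

The job could be swapped for something much heavier without touching the scheduling code, which is the same division of labour as a scheduler launching a Spring Batch job.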

Two aspects I can think of. First, as far as I know, when a job run fails, the second run uses the same job parameters (at least you can configure it that way, I think). Error situations like this are easier to configure in Spring Batch than to handle by hand, all in one place, in your scheduled method. Second, Spring Batch gives your code a structure when you have to read data from somewhere and write it somewhere else: it has a reader, processor, writer pattern. It also creates some database tables automatically (for example BATCH_JOB_INSTANCE) and records job results, such as when a job started.
Edit: more reasons to use Spring Batch: large amounts of data, transaction management, chunk-based processing, declarative I/O, start/stop/restart, retry/skip, web-based administration interface.
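To illustrate the reader, processor, writer pattern and chunk-based processing mentioned above, here is a rough plain-JDK sketch. It is not the Spring Batch API: an `Iterator` plays the reader, a `Function` the processor, and a `Consumer` of lists the writer, and the hypothetical `runStep` method hands the writer one chunk at a time. Transactions, restart, retry, and skip are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Plain-JDK sketch of chunk-oriented processing (illustrative names only). */
final class ChunkSketch {

    /**
     * Reads items one by one, transforms each item, and hands the results to the
     * writer in chunks of at most chunkSize. Returns the number of chunks written.
     */
    static <I, O> int runStep(Iterator<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunksWritten = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // a real framework would commit here
                chunk.clear();
                chunksWritten++;
            }
        }
        if (!chunk.isEmpty()) { // write the final, partially filled chunk
            writer.accept(new ArrayList<>(chunk));
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        Function<Integer, String> processor = i -> "item-" + i;
        Consumer<List<String>> writer = written::add;
        int chunks = runStep(List.of(1, 2, 3, 4, 5).iterator(), processor, writer, 2);
        System.out.println(chunks + " chunks: " + written);
    }
}
```

Writing per chunk rather than per item is what makes large inputs manageable: a failure only affects the chunk in flight, not the whole run.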

Related

How to assess whether to use spring batch or scheduler in application?

I have business logic already developed in Spring Boot that needs to run once every 60 days. I'm a little confused about whether to convert it to Spring Batch or use the scheduler annotation.
What factors should I consider to assess this? Does either of them have a performance advantage over the other?
I'm new to the scheduler and batch concepts, and this is my first time working with them.
Spring Batch is not a scheduler, so it's not an either or question. You can use both, for example use a scheduler to schedule a Spring Batch job to run at a given time.
The question you should be asking is: is it worth transforming the business logic you already developed in Spring Boot into a Spring Batch job to benefit from what Spring Batch offers? (The whole app could remain a Boot app.)
As a side note, since your job needs to run every 60 days, using @Scheduled means you would have a JVM running for two months just to run one job. Unless you are planning to use the same JVM for other things in the meantime, this would be an inefficient use of resources. Other scheduling mechanisms, like cron, are more appropriate in this case.

Spring task:scheduled or @Scheduled to restrict a job from running on multiple instances

I have one @Scheduled job that runs on multiple servers in a clustered environment. However, I want the job to run on only one server; the other servers should not run the same job once any server has started it.
I have seen that Spring Batch has a lock mechanism that uses a database table, but I am looking for a solution that uses only Spring task:scheduler.
I had the same problem. The solution I implemented was a lock mechanism built on Hazelcast; to make it easy to use, I also added a custom annotation and a bit of Spring AOP. With this trick I was able to enforce a single scheduled run across the cluster with a single annotation.
Spring Batch has a nice feature: it will not run a job with the same job arguments twice.
You can use this feature so that when a Spring Batch job starts on another server, it does not run again.
Usually people pass a timestamp as an argument, which bypasses this logic, but you can change that.
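The idea of "same job arguments, no second run" can be sketched in plain Java as follows. This is only a simplified model, not the Spring Batch API: the hypothetical `JobInstanceSketch` keeps claimed instances in an in-memory set, where a real cluster would need a shared store such as a database table. It also claims the instance on start and never releases it, so it does not model execution status or restarting a failed run.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified model: a job instance is identified by job name plus parameters. */
final class JobInstanceSketch {

    // Hypothetical stand-in for the shared job repository.
    private final Set<String> claimedInstances = ConcurrentHashMap.newKeySet();

    /**
     * Runs the task only if this name and parameter combination has not been
     * claimed before. Returns true if the task ran, false if it was skipped.
     */
    boolean launch(String jobName, Map<String, String> parameters, Runnable task) {
        // TreeMap sorts the keys, so equal parameters always produce the same key.
        String instanceKey = jobName + "|" + new TreeMap<>(parameters);
        if (!claimedInstances.add(instanceKey)) {
            return false; // another caller already claimed this instance
        }
        task.run();
        return true;
    }

    public static void main(String[] args) {
        JobInstanceSketch repository = new JobInstanceSketch();
        Map<String, String> params = Map.of("runDate", "2024-01-01");
        System.out.println(repository.launch("sync", params, () -> {})); // first claim
        System.out.println(repository.launch("sync", params, () -> {})); // same instance again
    }
}
```

This also shows why a timestamp parameter defeats the check: every launch then has different parameters and therefore counts as a new instance. Passing something coarser, such as the run date, keeps the duplicate protection.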

What modules to use for a synchronization service in Java/Spring?

I want to build a synchronization service in Java. The use case is that I fetch data from an Exchange service (via Exchange Web Services), normalize the data a bit (some processing, probably), and then write it to a backend via GraphQL. I have already looked around the Spring modules, but I am not quite sure which ones to use. I found Spring Batch and Quartz.
The synchronization will have to trigger every X seconds, fetch information from Exchange, look at what is already in the backend, and update what is needed.
Do you have any suggestions? I started implementing this in Node.js before, but since it has to run on both Windows Servers and Docker/Linux, it has been a real pain to keep it running smoothly (mostly because bundling Node.js into an application for Windows is a pain).
Difference between Spring Batch and Quartz:
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data, and Quartz provides functionality for scheduling tasks.
So Quartz can complement Spring Batch. A common combination is to use Quartz as a trigger for a Spring Batch job, using a cron expression.
Conclusion: Spring Batch defines what should be done; Quartz defines when it should be done.
Quartz is a scheduling framework, as in "execute something every hour or every last Friday of the month".
Spring Batch is a framework that defines the "something" that will be executed.
You can define a job that consists of steps. Usually a step consists of an item reader, an optional item processor, and an item writer, but you can also define a custom step. You can also tell Spring Batch to commit every 10 items, and a lot of other things.
You can use Quartz to start Spring Batch jobs.
Recommended for your use case:
Quartz scheduling, since you want to trigger the work at a specific interval.
Reference: https://projects.spring.io/spring-batch/faq.html

Spring batch or Spring boot async method execution?

I have a situation where the data is to be read from 4 different web services, process it and then store the results in a database table. Also send a notification after this task is complete. The trigger for this process is through a web service call.
Should I write my job as a Spring Batch job, or write the whole read/process code as an async method (using @Async) that is called from the REST controller?
Please advise.
In my opinion your choice should be @Async, because Spring Batch was designed for large-scale data processing and is not intended for on-demand processing; typically you create your batch job and then launch it on a schedule. The benefit of that kind of architecture is the reliability of your job, which can be restarted in case of failure, and so on. In your case you have a data integration problem, so I suggest looking at Spring Integration. You could have a Spring Integration pipeline that you start through a REST call.
I hope this helps.
If a large number of services has to be called, Spring Batch would be the choice. Otherwise, I guess there is no need to bring in Spring Batch.
In my opinion, the @Async annotation is the easier way.
If both approaches work, the simpler one is of course better.
Finally, if there will be more and more services, not only 4, Spring Batch would be the better solution, because that is what it specializes in.
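For a sense of what the simpler asynchronous route involves, here is a plain-JDK sketch that uses `CompletableFuture` instead of Spring's @Async. Everything in it is an assumption made for illustration: each web service is modelled as a `Supplier<String>`, the database table as a `List<String>`, and the notification as the returned message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Plain-JDK sketch: call several sources concurrently, process, store, notify. */
final class AsyncFanOutSketch {

    /** Calls every source concurrently and returns the results in source order. */
    static List<String> fetchAll(List<Supplier<String>> sources) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (Supplier<String> source : sources) {
            futures.add(CompletableFuture.supplyAsync(source)); // start all calls first
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> future : futures) {
            results.add(future.join()); // then wait for each result
        }
        return results;
    }

    /**
     * Starts the whole read, process, store sequence in the background and
     * completes with a notification message once everything is stored.
     */
    static CompletableFuture<String> start(List<Supplier<String>> sources, List<String> store) {
        return CompletableFuture.supplyAsync(() -> {
            for (String raw : fetchAll(sources)) {
                store.add(raw.trim().toUpperCase(Locale.ROOT)); // the "process" step
            }
            return "stored " + store.size() + " records";
        });
    }

    public static void main(String[] args) {
        List<Supplier<String>> sources = List.of(() -> " a ", () -> "b", () -> "c", () -> "d");
        List<String> store = new ArrayList<>();
        System.out.println(start(sources, store).join());
    }
}
```

The caller of `start` gets a future back immediately, which matches a REST endpoint that triggers the work and returns. What this sketch does not give you is restartability or a record of past runs, which is where Spring Batch earns its extra setup.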

Scheduling inside step using Spring-Batch

Part of my job is to poll the DB for a specific result status, and only after that can I continue with the job's next steps.
Is it recommended to stop the job's process while polling in one of the steps (a tasklet, I guess)?
Polling the DB for a specific result sounds like a situation where you need a scheduler.
Spring Batch assumes that the scheduling of the job is done outside its scope.
You can use the @Scheduled Spring annotation if you want to keep everything inside the Spring configuration, or use an external scheduling tool.
If you have more complex situations, have a look at Spring Batch Integration.
