Spring batch or Spring boot async method execution? - spring

I have a situation where the data is to be read from 4 different web services, process it and then store the results in a database table. Also send a notification after this task is complete. The trigger for this process is through a web service call.
Should I write my job as a spring batch job or write the whole read/process code as an async method (using #Async) which is called from the Rest Controller?
Kindly suggest

In my opinion the your choice should be #Async, because Spring Batch was designed for large data processing and it isn't thought to processing on demand, typically you create a your batch and then launch the batch with a schedule. The benefit of this kind of architetture will be the reliability of your job that colud restarted in case of fail and so on. In your case you have a data integration problem and I can suggest to see at Spring Integration. You could have a Spring Integration pipeline that you start through a rest call.
I hope that this can help you

If there are large amounts of services should be executed, spring-batch would be the choice. Otherwise, I guess there is no need to import spring-batch.
In my opinion, #Async annotation is the easier way.
If both methods can work, of course simpler the better.
At the end, if there will be more and more service not only 4, spring-batch would be the better solution, cause spring-batch are professional in this.

Related

How to assess whether to use spring batch or scheduler in application?

I have a business logic already developed in spring boot that needs to be run once in every 60 days. I'm little confused on whether convert it to spring batch or use scheduler annotation.
What all factors should I consider to assess the same? Does either of them has any performance upper-hand over the other?
I'm new to the scheduler-batch concept and this is my first time work on the same.
How to assess whether to use spring batch or scheduler in application?
Spring Batch is not a scheduler, so it's not an either or question. You can use both, for example use a scheduler to schedule a Spring Batch job to run at a given time.
The question you should be asking is: is it worth transforming the business logic you already developed in spring boot in a Spring Batch job to benefit from what Spring Batch offers (the whole app could remain a boot app).
As a side note, since your job needs to be run every 60 days, using #Scheduled means you would have a JVM running for two months to run a job. Unless you are planning to use the same JVM for other things in the meantime, this would be an inefficient use of resources. Other scheduling mechanisms like cron is more appropriate in this case.

What modules to use for a synchronization service in Java/Spring?

I'm willing to build a synchronization service in Java. The use case is, that i'm fetching data from an exchange-service (via Exchange Web Services), normalize the data a bit (process probably) and then write it to a backend via GraphQL. I already had a look around the spring modules, but am not quite sure what modules to use. I found spring batch and spring quartz.
The synchronization will have to trigger all X seconds, fetch information from the Exchange, look what's in the backend already and update what's needed.
Do you guys have any suggestions? I started implementing this whole thing in nodejs before, but as it has to run on both, Windows Servers and Docker/Linux, it has been a real pain to keep it running smooth (mostly because bundling nodejs to an application for Windows is pain).
Difference between Spring Batch & Quartz:
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data and Quartz provides functionality for scheduling tasks.
So Quartz could complement Spring Batch, A common combination would be to use Quartz as a trigger for a Spring Batch job using a Cron expression.
Conclusion : So basically Spring Batch defines what should be done, Quartz defines when it should be done.
Quartz is a scheduling framework. Like "execute something every hour or every last friday of the month"
Spring Batch is a framework that defines that "something" that will be executed.
You can define a job, that consists of steps. Usually a step is something that consists of item reader, optional item processor and item writer, but you can define a custom stem. You can also tell Spring batch to commit on every 10 items and a lot of other stuff.
You can use Quartz to start Spring Batch jobs.
Recommended for your use case :
Quartz scheduling as you want trigger after specific interval.
Reference :https://projects.spring.io/spring-batch/faq.html

Activiti vs Spring batch

I have got a use case to implement. It's basically a workflow kind of use case. Below is the requirements
Extract and import data from an external db to an internal db
Make this imported data into different formats and supply it to multiple external systems and invoke some script there. The external interfaces are SFTP, SOAP, JDBC, Python over CORBA. There are around 14 external systems with one of these interfaces.
Interface transactions are executed in around 15 steps, with the ability to run some steps in parallel
These steps should be configurable. ie, a particular flow may execute 10 of these 15 steps and another flow executes 15 of 15 steps
Should have the ability to restart each step individually or restart from a particular step
There are some steps that are manual and completion of manual step should trigger next step
Volume of data is not that large. Total data size is around 400k records. But this process is executing for around 30k records at a time. Time for development is less and we are looking for some light weight easy to learn and implement solution.
We are looking for Spring based or Spring integratable solutions.
The solutions we considered are
For workflow:
Activiti, Spring Batch
For interfaces:
Spring Integration
My question is
Can Spring batch considered for managing a work flow kind of use case? I don't think it's a best fit use case for Spring Batch but as its simple and easy to implement looked for its scope. We considered doing the interfaces interaction as each step in a batch job and inside the tasklet do the Spring Integration for external interfaces, with few issues as far as I understand are
a) Dynamic step configuration can be done with Java configuration, but how flexible it is and is it recommended?
b) Manual step processing is not possible in Spring Batch
Is there any work around for this? Is there any other issues or performance impacts on doing this?
Activiti seems to a solution. Can you please provide some feedback on Activiti with Spring and Spring integration for this use case and ease of implementing it? And support for Activiti
Can Activiti workflows restarted from a particular task? Is a task can be rollbacked?
Welcoming any suggestions !!
1) For managing workflows, Activiti would be a great choice. They have created a really good process engine which should comply your needs for delegating your tasks as well as calling your custom logic. Moreover, it is based entirely on Spring Framework so Integration with your logic would be easy.
2) i've provided the same in first answer.
3) No, you will have to create a new workflow for that and Yes!, a task can be rolled back.

Scheduling inside step using Spring-Batch

Part of my job is to poll DB for a specific result status and only after that I can continue with next job's steps.
Is it recommended to stop the job's process while doing polling in one of the steps(tasklet I guess) ?
Polling the DB for a specific result sounds like a situation where you need a scheduler.
Spring-batch assumes the scheduling of the job is done from outside it's scope.
You can use #Scheduled spring annotation if you want to keep all inside spring configuration or use an external tool like this.
If you have more complex situations, have a look at Spring Batch Integration.

Spring Batch for File Processing

Is Spring Batch a good fit for processing a a large number of individual files?
Spring Batch seems to be geared towards data-centric jobs. I've got a requirement to pull down several million files from an S3 bucket, unzip them, perform some logic based on the contents, then call a web service.
Implementing this by hand is trivial, but I don't much fancy re-inventing the wheel when it comes to tracking job executions, and how far a job got along before it failed. Spring Batch seems to be an ideal fit for this job-monitoring, but I'm not sure whether subverting it to do file processing is a step too far.
Short answer is Yes, you can use spring batch for this. I had done a small POC where we had to migrate millions of images from source system to target system in a batch process and it works well IMHO.
Adding on to comment by #Prasanna Talakanti, I would suggest to use a combination of Spring Integration and Spring Batch. While Spring batch will provide you infrastructure for batch processing (Commit at intervals, restart job if failed etc), Spring integration will provide you things around web service gateways.
In Spring batch, you can define reader for reading data from S3 and writer for writing to your destination with processor in between if needed. You could also fine tune the commit interval so if the job fails in between, you have a point of rollback.

Resources