Java Spring Boot Batch - Need some advise in design - spring-boot

I am a newbie in Java and trying to implement a Spring Boot batch application.
My requirement is like to check some data in database (one part) and delete if found (another part).
I am planning to implement Spring Boot batch for this.
I will have one job which will have 2 steps. If Step 1 find some data then only execute step 2? Can I achieve in Spring Boot Batch? Or what is the best way to achieve this keeping in mind I have to schedule this to run weekly.

With just the scheduled job for find and delete records from DB, I don't suggest using Spring Batch. Spring has nice good way of doing it without Batch using scheduling-tasks. You can see example here. Use Spring Batch only if you need to run jobs in batch that can't be handled with normal operation.
If you need complex scheduler, you can use Spring Quartz scheduler.

Related

How to assess whether to use spring batch or scheduler in application?

I have a business logic already developed in spring boot that needs to be run once in every 60 days. I'm little confused on whether convert it to spring batch or use scheduler annotation.
What all factors should I consider to assess the same? Does either of them has any performance upper-hand over the other?
I'm new to the scheduler-batch concept and this is my first time work on the same.
How to assess whether to use spring batch or scheduler in application?
Spring Batch is not a scheduler, so it's not an either or question. You can use both, for example use a scheduler to schedule a Spring Batch job to run at a given time.
The question you should be asking is: is it worth transforming the business logic you already developed in spring boot in a Spring Batch job to benefit from what Spring Batch offers (the whole app could remain a boot app).
As a side note, since your job needs to be run every 60 days, using #Scheduled means you would have a JVM running for two months to run a job. Unless you are planning to use the same JVM for other things in the meantime, this would be an inefficient use of resources. Other scheduling mechanisms like cron is more appropriate in this case.

Spring task:scheduled or #Scheduler to restrict a Job to run in multiple instance

I have one #Scheduler job which runs on multiple servers in a clustered environment. However I want to restrict the job to run in only in one server and other servers should not run the same job once any other server has started it .
I have explored Spring Batch has lock mechanism using some Database table , but looking for any a solution only in spring task:scheduler.
I had the same problem and the solution what I implemented was a the Lock mechanism with Hazelcast and to made it easy to use I also added a proper annotation and a bit of spring AOP for that. So with this trick I was able to enforce a single schedule over the cluster done with a single annotation.
Spring Batch has this nice functionality that it would not run the job with same job arguments twice.
You can use this feature so that when a spring batch job kicks start in another server it does not run.
Usually people pass a timestamp as argument so it will by pass this logic, which you can change it.

Spring batch or Spring core libraries for building file operation process

I'm dipping my toes into the microservices, is spring boot batch applicable to the following requirements?
Files of one or multiple are read from a specific directory in Linux.
Several operations like regex, build new files, write the file and ftp to a location
Send email during a process fail
Using spring boot is confirmed, now the question is
Should I use spring batch or just core spring framework?
I need to integrate with Control-M to trigger the job. Can the Control-M be completely removed by using Spring batch library? As we don't know when to expect the files in the directory.
I've not seen a POC with these requirements. Would someone provide an example POC or an affirmation this could be achieved with Spring batch?
I would use Spring Batch for that use case. Not only does it provide out of the box components for reading, processing, and writing files, it adds a lot more for error handling, scalability, etc. All of those things you'd probably end up wiring up by yourself if you go without Spring Batch.
As for being launched via Control-M, yes MANY large customers use Control-M to launch their jobs. Unfortunately, I've never done it myself so I cannot provide any details on the mechanics, but if Control-M can either launch a script or call a REST API, you can launch a job with it.
I would suggest you, go for spring batch as it has much-inbuilt functionality which will be provided to you for file reading and writing to your required location. Even you will be able to handle record skipping requirement. Your mail triggering requirement will be handled by Control M. You just need to decide one exit code for your handled exception and on the basis of that exit code you can trigger the mail to respective members. And there are many other features which will be helpful if you go for spring batch.

How do I kickoff a batch job when input file arrives?

We have Spring4 and Spring Batch 3 and our app consumes CSV files as input file. Currently we kick off the jobs manually from the command line, using CommandLineJobRunner with parms, including the name of the file to process.
I want to kick off a job to process asynchronously just as soon as the input file arrives in a monitored directory. How can we do that?
You may use java.nio.file.WatchService to monitor directory for a file.
Once file appears you may start (or kick off a job to process asynchronously) actual processing.
You may also use FileReadingMessageSource.WatchServiceDirectoryScanner from Spring Integration (https://docs.spring.io/spring-integration/reference/html/files.html#watch-service-directory-scanner)
Comparing release notes Spring Batch https://github.com/spring-projects/spring-batch/releases
to Spring Integration https://github.com/spring-projects/spring-integration/releases it looks that Spring Integration is released more often. It also has more features and Integration points.
In this case it looks like a overkill to bring Spring Integration if you just need to watch a directory for a file.
I would recommend using the powerful combination of Spring Batch with Spring Integration. For example, you can use a FileInboundChannelAdapter from Spring Integration to monitor a directory and start a Spring Batch Job as soon as the input file arrives.
There is a code example for this typical use case in the reference documentation of Spring Batch here: https://docs.spring.io/spring-batch/4.0.x/reference/html/spring-batch-integration.html#launching-batch-jobs-through-messages
I hope this helps.

What modules to use for a synchronization service in Java/Spring?

I'm willing to build a synchronization service in Java. The use case is, that i'm fetching data from an exchange-service (via Exchange Web Services), normalize the data a bit (process probably) and then write it to a backend via GraphQL. I already had a look around the spring modules, but am not quite sure what modules to use. I found spring batch and spring quartz.
The synchronization will have to trigger all X seconds, fetch information from the Exchange, look what's in the backend already and update what's needed.
Do you guys have any suggestions? I started implementing this whole thing in nodejs before, but as it has to run on both, Windows Servers and Docker/Linux, it has been a real pain to keep it running smooth (mostly because bundling nodejs to an application for Windows is pain).
Difference between Spring Batch & Quartz:
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data and Quartz provides functionality for scheduling tasks.
So Quartz could complement Spring Batch, A common combination would be to use Quartz as a trigger for a Spring Batch job using a Cron expression.
Conclusion : So basically Spring Batch defines what should be done, Quartz defines when it should be done.
Quartz is a scheduling framework. Like "execute something every hour or every last friday of the month"
Spring Batch is a framework that defines that "something" that will be executed.
You can define a job, that consists of steps. Usually a step is something that consists of item reader, optional item processor and item writer, but you can define a custom stem. You can also tell Spring batch to commit on every 10 items and a lot of other stuff.
You can use Quartz to start Spring Batch jobs.
Recommended for your use case :
Quartz scheduling as you want trigger after specific interval.
Reference :https://projects.spring.io/spring-batch/faq.html

Resources