Spring Batch disable Spring Boot AutoConfiguration for specific jobs - spring-boot

I have multiple jobs for my Spring Batch application, but only a single job uses some specific Spring Boot auto configuration features:
a job that uses spring-data-jpa auto configuration, to configure a database for business transactions (not for Spring Batch management)
a job that does not use the database at all
I have packaged both jobs in the same unit because it makes sense from a business perspective. Both jobs work together, and the output of one job is the input of the other.
Is it possible to disable the database-specific auto-configuration when I run the second job?

I just tried using profiles and I have disabled the autoconfiguration for a specific profile. I am pretty happy with this solution but I wonder if there are other solutions?
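A minimal sketch of what such a profile-specific exclusion could look like, assuming the second job is launched with a dedicated profile. The profile name "nodb" is hypothetical, and the auto-configuration class names are the ones used in Spring Boot 2.x/3.x; check them against the Boot version in use. It also assumes, as stated in the question, that Spring Batch itself does not rely on the excluded DataSource.

```properties
# application-nodb.properties (the profile name "nodb" is hypothetical)
# Only loaded when the application starts with --spring.profiles.active=nodb
spring.autoconfigure.exclude=\
  org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration,\
  org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration,\
  org.springframework.boot.autoconfigure.data.jpa.JpaRepositoriesAutoConfiguration
```

The job that needs JPA would simply be started without that profile, so its auto-configuration stays untouched.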
This is similar to trying to lazy-load beans specific to a given job; see "How to apply something like @Lazy to Spring Batch?". While Spring profiles may fix your issue, I believe they work around the root problem, which is packaging all jobs in a monolithic way.
I would package each job separately and this problem (and many others) disappears by design. There are several advantages to this approach:
Independent lifecycle management (bugs, features, etc.)
Flexible deployment
Separate logs
Separate configurations (as in the current issue)
Easier/Better scalability
And all the good reasons to make one thing do one thing and do it well.

Related

Spring task:scheduled or @Scheduled to restrict a Job to run in multiple instance

I have one @Scheduled job which runs on multiple servers in a clustered environment. However, I want the job to run on only one server; the other servers should not run the same job once any server has started it.
I have seen that Spring Batch has a lock mechanism based on a database table, but I am looking for a solution that uses only Spring task:scheduler.
I had the same problem. The solution I implemented was a lock mechanism built on Hazelcast; to make it easy to use, I also added a dedicated annotation and a bit of Spring AOP. With this trick I was able to enforce a single scheduled run across the cluster with a single annotation.
Spring Batch has a nice feature: it will not run a job twice with the same job parameters.
You can use this feature so that when a Spring Batch job is kicked off on another server, it does not run.
Usually people pass a timestamp as a parameter, which bypasses this logic; you can change that.
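One way to "change that" can be sketched in plain Java: instead of a unique timestamp, each server derives the same value for a given schedule slot and would pass it as the identifying job parameter. The class and method names here are hypothetical, and the approach assumes that all servers share one Spring Batch job repository, so that the second launch with the same parameters is rejected.

```java
import java.time.Duration;
import java.time.Instant;

/** Derives one shared run identifier per schedule slot instead of a unique timestamp per server. */
final class ScheduleSlotKey {

    /** Rounds the trigger time down to the start of its slot, so every server computes the same value. */
    static String forTrigger(Instant triggerTime, Duration slotLength) {
        long slotMillis = slotLength.toMillis();
        long slotStart = Math.floorDiv(triggerTime.toEpochMilli(), slotMillis) * slotMillis;
        return Instant.ofEpochMilli(slotStart).toString();
    }

    public static void main(String[] args) {
        Duration everyFiveMinutes = Duration.ofMinutes(5);
        // Two servers fire the same trigger a few hundred milliseconds apart.
        String serverA = forTrigger(Instant.parse("2024-01-01T10:05:00.120Z"), everyFiveMinutes);
        String serverB = forTrigger(Instant.parse("2024-01-01T10:05:00.870Z"), everyFiveMinutes);
        System.out.println("same identifying parameter: " + serverA.equals(serverB));
    }
}
```

The slot length should match the schedule interval; clock drift between servers that is larger than the slot would defeat the idea.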

How to handle cron jobs when a Spring Boot app is deployed on multiple nodes?

I use Spring task scheduling to handle a simple sync job. But when I deploy to multiple nodes, how can I make sure the cron job runs only once?
Maybe you will say:
1. Use a distributed lock to control a flag before the cron job runs.
2. Integrate the Quartz cluster function.
But I hope Spring task's @EnableScheduling can accept a flag argument, so that we can set a flag when launching the app.
We are using https://github.com/lukas-krecan/ShedLock successfully, the ZooKeeper provider in particular.
Spring Boot, in a nutshell, does not itself provide any coordination between multiple instances of the same microservice.
All the work of such coordination is done by the third-party libraries that Spring Boot integrates with.
One example of this is indeed the @Scheduled annotation.
Another is DB migration support via Flyway.
When many nodes start and a migration has to be done, Flyway is responsible for locking the migration table by itself; Spring Boot has nothing to do with it.
So, bottom line, there is no such support, and all the options you've raised can work.
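The distributed-lock option can be sketched in plain Java as follows. This is only an in-memory stand-in to show the "acquire before running, skip otherwise" logic: the class name is hypothetical, it is not the ShedLock API, and in a real cluster the map would have to be replaced by a store that every node can reach (a database table, ZooKeeper, Hazelcast, and so on).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for a shared lock table; a real cluster needs a store that every node can reach. */
final class ScheduledTaskLock {

    // Maps lock name -> instant until which the lock is held.
    private final ConcurrentMap<String, Instant> lockedUntil = new ConcurrentHashMap<>();

    /** Atomically takes the lock if it is free or expired; returns true only for the winning caller. */
    boolean tryAcquire(String name, Instant now, Duration holdFor) {
        boolean[] acquired = {false};
        lockedUntil.compute(name, (key, until) -> {
            if (until == null || !until.isAfter(now)) {
                acquired[0] = true;
                return now.plus(holdFor);
            }
            return until; // still held by another node
        });
        return acquired[0];
    }

    public static void main(String[] args) {
        ScheduledTaskLock lock = new ScheduledTaskLock();
        Instant fireTime = Instant.parse("2024-01-01T00:00:00Z");
        // Three nodes reach the same cron trigger; only the first one to take the lock runs the job.
        for (String node : new String[] {"node-1", "node-2", "node-3"}) {
            if (lock.tryAcquire("nightly-sync", fireTime, Duration.ofMinutes(10))) {
                System.out.println(node + " runs the job");
            } else {
                System.out.println(node + " skips this run");
            }
        }
    }
}
```

Holding the lock for a fixed duration rather than releasing it explicitly means a crashed node cannot block later runs forever; the duration should be shorter than the cron interval.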

What modules to use for a synchronization service in Java/Spring?

I want to build a synchronization service in Java. The use case is that I fetch data from an Exchange service (via Exchange Web Services), normalize the data a bit (probably a processing step), and then write it to a backend via GraphQL. I have already had a look at the Spring modules, but I am not quite sure which ones to use. I found Spring Batch and Spring Quartz.
The synchronization will have to trigger every X seconds, fetch information from Exchange, look at what's already in the backend, and update what's needed.
Do you have any suggestions? I started implementing this whole thing in Node.js before, but as it has to run on both Windows Servers and Docker/Linux, it has been a real pain to keep it running smoothly (mostly because bundling Node.js into an application for Windows is a pain).
Difference between Spring Batch and Quartz:
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data, and Quartz provides functionality for scheduling tasks.
So Quartz can complement Spring Batch. A common combination is to use Quartz as a trigger for a Spring Batch job using a cron expression.
Conclusion: Spring Batch defines what should be done; Quartz defines when it should be done.
Quartz is a scheduling framework, as in "execute something every hour" or "every last Friday of the month".
Spring Batch is a framework that defines the "something" that will be executed.
You can define a job that consists of steps. Usually a step consists of an item reader, an optional item processor, and an item writer, but you can define a custom step. You can also tell Spring Batch to commit every 10 items, and a lot of other things.
You can use Quartz to start Spring Batch jobs.
Recommended for your use case:
Quartz scheduling, as you want to trigger after a specific interval.
Reference: https://projects.spring.io/spring-batch/faq.html
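The reader, processor, writer, and commit-interval idea can be illustrated with a small plain-Java sketch. This is not the Spring Batch API; the class and method names are hypothetical, and the "commit" is simply one call to the writer per chunk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Plain-Java illustration of a chunk-oriented step; this is not the Spring Batch API. */
final class ChunkStepSketch {

    /** Reads items one by one, transforms each, and hands them to the writer in chunks; returns the chunk count. */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int commitInterval) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>(commitInterval);
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == commitInterval) {
                writer.accept(new ArrayList<>(chunk)); // one "commit" per full chunk
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk)); // final, partially filled chunk
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            input.add(i);
        }
        Function<Integer, String> processor = n -> "item-" + n;
        Consumer<List<String>> writer = written -> System.out.println("writing " + written.size() + " items");
        // 25 items with a commit interval of 10 are written as chunks of 10, 10 and 5.
        int chunks = run(input.iterator(), processor, writer, 10);
        System.out.println(chunks + " chunks");
    }
}
```

In Spring Batch the framework supplies this loop and wraps each chunk in a transaction; only the reader, processor, and writer are written by the application.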

Spring Integration Invoking Spring Batch

I'm looking for information on whether others have solved this pattern. I want to use Spring Integration and Spring Batch together. Both are Spring Boot applications, and ideally I'd like to keep them and their respective configurations separate, each as its own executable jar. I'm having problems executing them in their own process spaces. Unless someone can convince me otherwise, I believe I want each to run as its own Spring Boot app and initialize itself with its own profiles and properties. What I'm having trouble with is invoking the job in my Spring Batch project from my Spring Integration project. At first I couldn't get the properties loaded from the batch project, so I realized I needed to pass spring.profiles.active as a job parameter, and that seemed to solve it. But other things in the Spring Boot batch application aren't loading correctly, such as the schema-platform.sql file, and the database isn't getting initialized, etc.
On this initial launch of the job, I might want the response to go back to Spring Integration for some messaging on job status. There might be times when I want to run a job without Spring Integration kicking it off, but still take advantage of sending statuses back to the Spring Integration project, provided it's listening on a channel or something.
I've reviewed quite a few Spring samples and have yet to find my exact scenario; most have the two dependencies in the same project. So maybe I'm doing something that's not possible, but I suspect I'm just missing a little something in the Spring configuration.
My questions/issues are:
I don't want the Spring Integration project to know anything about the Spring Batch configuration other than the job it's kicking off. I haven't found a good way to reference the Job bean without loading my entire batch configuration.
Should I keep these two projects separate, or would it be better to combine them, since I have two-way communication between them?
How should the job be launched from the integration project? We're using the spring-batch-integration project with JobLaunchRequest and JobLauncher. This seems to run the job in the same process as the Spring Integration project, and I'm missing a lot of my Spring Boot batch project's initialization.
Should I be using a CommandLineRunner instead to force it into another process?
Is SpringApplication.run(BatchConfiguration.class) the answer?
Looking for some general project configuration setup to meet these requirements.
Spring Cloud Data Flow in combination with Spring Cloud Task does exactly what you're asking. It launches Spring Cloud Task applications (which can contain batch jobs) as new processes on the platform you choose. I'd encourage you to check out that project here: http://cloud.spring.io/spring-cloud-dataflow/

Activiti vs Spring batch

I have a use case to implement. It's basically a workflow kind of use case. The requirements are below:
Extract and import data from an external db to an internal db
Make this imported data into different formats and supply it to multiple external systems and invoke some script there. The external interfaces are SFTP, SOAP, JDBC, Python over CORBA. There are around 14 external systems with one of these interfaces.
Interface transactions are executed in around 15 steps, with the ability to run some steps in parallel
These steps should be configurable, i.e., one flow may execute 10 of these 15 steps while another executes all 15
Should have the ability to restart each step individually or restart from a particular step
Some steps are manual, and completion of a manual step should trigger the next step
The volume of data is not that large. Total data size is around 400k records, but the process executes for around 30k records at a time. Development time is short, and we are looking for a lightweight solution that is easy to learn and implement.
We are looking for Spring-based solutions or solutions that can be integrated with Spring.
The solutions we considered are
For workflow:
Activiti, Spring Batch
For interfaces:
Spring Integration
My questions are:
Can Spring Batch be considered for managing a workflow kind of use case? I don't think it's the best fit for Spring Batch, but as it's simple and easy to implement, we looked at whether it is in scope. We considered modeling each interface interaction as a step in a batch job and calling Spring Integration from inside the tasklet for the external interfaces. As far as I understand, there are a few issues:
a) Dynamic step configuration can be done with Java configuration, but how flexible is it, and is it recommended?
b) Manual step processing is not possible in Spring Batch
Is there any workaround for this? Are there any other issues or performance impacts in doing this?
Activiti seems to be a solution. Can you please provide some feedback on using Activiti with Spring and Spring Integration for this use case, on how easy it is to implement, and on the support available for Activiti?
Can Activiti workflows be restarted from a particular task? Can a task be rolled back?
Any suggestions are welcome!
1) For managing workflows, Activiti would be a great choice. They have created a really good process engine which should meet your needs for delegating your tasks as well as calling your custom logic. Moreover, it is based entirely on Spring Framework, so integration with your logic would be easy.
2) I've covered this in the first answer.
3) No, you will have to create a new workflow for that. And yes, a task can be rolled back.
