We are migrating XML-style Spring Batch jobs to JavaConfig, and it seems impossible to use a decider as the first element of the job flow logic, i.e. right after start(). We have to insert a dummy step just to be able to invoke the decider.
However, in XML, this configuration works perfectly fine:
<batch:decision decider="exitsDecider" id="exitsInstance">
    <batch:next on="CONTINUE" to="jobStartStep" />
    <batch:end on="COMPLETED" exit-code="COMPLETED" />
    <batch:fail on="FAILED" exit-code="FAILED" />
</batch:decision>
<batch:step id="jobStartStep" next="validateStep">
    <batch:tasklet ref="jobStartTasklet" />
</batch:step>
I don't know whether this is an undocumented limitation or just some sort of corner case. What would be the equivalent in JavaConfig?
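For reference, a hedged sketch of the kind of workaround we are experimenting with (this assumes the standard Spring Batch Java builders, and the bean names are illustrative): FlowBuilder, unlike the plain step-based job builder, does accept a decider as its starting element, so wrapping the flow logic in a Flow seems to avoid the dummy step.

// Inside a @Configuration class; exitsDecider, jobStartStep and validateStep
// are beans defined elsewhere (illustrative names).
@Bean
public Job exitsJob(JobBuilderFactory jobs, JobExecutionDecider exitsDecider,
                    Step jobStartStep, Step validateStep) {
    // FlowBuilder.start(JobExecutionDecider) lets the decider be the first element.
    Flow flow = new FlowBuilder<Flow>("exitsFlow")
            .start(exitsDecider)
                .on("CONTINUE").to(jobStartStep).next(validateStep)
            .from(exitsDecider).on("COMPLETED").end("COMPLETED")
            .from(exitsDecider).on("FAILED").fail()
            .build();
    return jobs.get("exitsJob").start(flow).end().build();
}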
I have chained a set of Spring Batch jobs to run in a specific order.
<batch:job id="rootJob">
    <batch:step id="rootJob.step1">
        <batch:job ref="externalJob1" />
        <batch:next on="COMPLETE" to="rootJob.step2" />
    </batch:step>
    <batch:split id="rootJob.step2">
        <batch:flow>
            <batch:step id="splitStep1">
                <batch:job ref="externalJob2" />
            </batch:step>
        </batch:flow>
        <batch:flow>
            <batch:step id="splitStep2">
                <batch:job ref="externalJob3" />
            </batch:step>
        </batch:flow>
        <batch:next on="COMPLETE" to="rootJob.step3" />
    </batch:split>
    <batch:step id="rootJob.step3">
        <batch:job ref="externalJob4" />
    </batch:step>
</batch:job>
The expected job flow execution:
1. On completion of rootJob.step1, execute rootJob.step2.
2. Execute splitStep1 and splitStep2 in parallel.
3. On completion of rootJob.step2, execute rootJob.step3.
But when deployed and triggered in JBoss, the flow does not execute as expected: the steps are triggered in a single stretch, and the execution does not wait for the previous step to complete before launching the next one.
I suspect the TaskExecutor. Standalone, we do not specify any task executor (it defaults to SyncTaskExecutor) and the job flow works fine. But when deployed in JBoss we use SimpleAsyncTaskExecutor, because with SyncTaskExecutor the job does not even get triggered in JBoss.
What am I missing, or am I doing something wrong here? Please suggest.
Resolved the issue.
I had provided the job-launcher="jobLauncher" attribute as below, so separate threads were launched and the jobs were triggered in parallel:
<batch:job ref="externalJob1" job-launcher="jobLauncher" />
I have now removed the jobLauncher reference from all the jobs, and they are triggered in order as designed.
I have a job configuration where I load a set of files in parallel. After that set is loaded I want to load another set of files, also in parallel, but only once the first set is completely loaded, because the second set has referential fields to the first. I thought I could use a second split but never got it working; in the XSD it seems you can define more than one split, and obviously a single flow does not cover my requirement.
So how do I define two sets of parallel flows which run in sequence to each other?
<job>
    <split>
        <flow>
            <step next="step2"/>
            <step id="step2"/>
        </flow>
        <flow>
            <step .../>
        </flow>
    </split>
    <split .../>
</job>
Asoub was right: it is indeed possible. I tried a simple config and it worked, so the problem I originally hit must have had some other cause than defining two splits.
Simple config I used:
<batch:job id="batchJob" restartable="true">
    <batch:split id="x" next="y">
        <batch:flow>
            <batch:step id="a">
                <batch:tasklet allow-start-if-complete="true">
                    <batch:chunk reader="itemReader" writer="itemWriter" commit-interval="2"/>
                </batch:tasklet>
            </batch:step>
        </batch:flow>
        <batch:flow>
            <batch:step id="b">
                <batch:tasklet allow-start-if-complete="true">
                    <batch:chunk reader="itemReader" writer="itemWriter" commit-interval="2"/>
                </batch:tasklet>
            </batch:step>
        </batch:flow>
    </batch:split>
    <batch:split id="y" next="e">
        <batch:flow>
            <batch:step id="c">
                <batch:tasklet allow-start-if-complete="true">
                    <batch:chunk reader="itemReader" writer="itemWriter" commit-interval="2"/>
                </batch:tasklet>
            </batch:step>
        </batch:flow>
        <batch:flow>
            <batch:step id="d">
                <batch:tasklet allow-start-if-complete="true">
                    <batch:chunk reader="itemReader" writer="itemWriter" commit-interval="2"/>
                </batch:tasklet>
            </batch:step>
        </batch:flow>
    </batch:split>
    <batch:step id="e">
        <batch:tasklet allow-start-if-complete="true">
            <batch:chunk reader="itemReader" writer="itemWriter" commit-interval="2"/>
        </batch:tasklet>
    </batch:step>
</batch:job>
INFO: Job: [FlowJob: [name=batchJob]] launched with the following parameters: [{random=994444}]
Nov 23, 2016 11:33:24 PM org.springframework.batch.core.job.SimpleStepHandler handleStep
INFO: Executing step: [a]
Nov 23, 2016 11:33:24 PM org.springframework.batch.core.job.SimpleStepHandler handleStep
INFO: Executing step: [b]
Nov 23, 2016 11:33:24 PM org.springframework.batch.core.job.SimpleStepHandler handleStep
INFO: Executing step: [c]
Nov 23, 2016 11:33:24 PM org.springframework.batch.core.job.SimpleStepHandler handleStep
INFO: Executing step: [d]
Nov 23, 2016 11:33:24 PM org.springframework.batch.core.job.SimpleStepHandler handleStep
INFO: Executing step: [e]
Nov 23, 2016 11:33:25 PM org.springframework.batch.core.launch.support.SimpleJobLauncher run
INFO: Job: [FlowJob: [name=batchJob]] completed with the following parameters: [{random=994444}] and the following status: [COMPLETED]
As I said in the comments, "So how do I define two sets of parallel flows which run in sequence to each other?" doesn't make sense per se: you can't start two steps both in parallel and sequentially.
Still, I think you want to "start loading file2 in step2 when file1 in step1 has finished loading", which means that the loading of a file completes in the middle of a step. I see two ways of solving this.
Let's say this is your configuration:
<job id="job1">
    <split id="split1" task-executor="taskExecutor" next="step3">
        <flow>
            <step id="step1" parent="s1"/>
        </flow>
        <flow>
            <step id="step2" parent="s2"/>
        </flow>
    </split>
    <step id="step3" parent="s4"/> <!-- not important here -->
</job>

<beans:bean id="taskExecutor" class="org.spr...SimpleAsyncTaskExecutor"/>
But this will start both of your steps in parallel immediately. You need to prevent step2 from starting right away. So you need a delegate in step2's reader that immediately stops loading file2 and waits for a signal before it starts reading. And somewhere in the code of step1, at the point where you consider loading to be done, you send a signal to step2's delegating reader to start loading file2.
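A minimal sketch of such a delegating reader (my illustration, not code from the original post; the shared latch and the names are assumptions):

import java.util.concurrent.CountDownLatch;
import org.springframework.batch.item.ItemReader;

// Illustrative sketch: step2's reader delegates to the real file2 reader, but
// blocks until step1 counts down the shared latch to signal file1 is loaded.
public class GatedItemReader<T> implements ItemReader<T> {

    private final ItemReader<T> delegate;          // the real reader for file2
    private final CountDownLatch firstFileLoaded;  // shared with step1

    public GatedItemReader(ItemReader<T> delegate, CountDownLatch firstFileLoaded) {
        this.delegate = delegate;
        this.firstFileLoaded = firstFileLoaded;
    }

    @Override
    public T read() throws Exception {
        firstFileLoaded.await(); // no-op once the latch has been opened
        return delegate.read();
    }
}

Step1 would call firstFileLoaded.countDown() at the point where you consider file1 fully loaded, for instance from a StepExecutionListener's afterStep.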
The second solution: create your own SimpleAsyncTaskExecutor variant which starts step1 and waits for the signal from step1 before starting step2. It's basically the first solution, but you wait for the signal in your custom executor rather than in a delegating reader (you can copy the source code of SimpleAsyncTaskExecutor to get an idea).
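A rough illustration of that second approach (again my sketch, not Spring code; it assumes the split submits step1's flow first, which is a fragile assumption):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.core.task.TaskExecutor;

// Illustrative sketch: the first submitted task (assumed to be step1's flow)
// runs immediately; any later task waits on the gate before running.
public class GatedTaskExecutor implements TaskExecutor {

    private final TaskExecutor delegate = new SimpleAsyncTaskExecutor();
    private final AtomicBoolean first = new AtomicBoolean(true);
    private final CountDownLatch gate; // opened by step1's signal

    public GatedTaskExecutor(CountDownLatch gate) {
        this.gate = gate;
    }

    @Override
    public void execute(Runnable task) {
        if (first.compareAndSet(true, false)) {
            delegate.execute(task);
        } else {
            delegate.execute(() -> {
                try {
                    gate.await(); // wait for step1 before starting the other flow
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                task.run();
            });
        }
    }
}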
This comes at a cost: if step1 never reaches the point where it signals step2 to start loading, your batch will hang forever (an exception during loading could cause this, for example). As for the signalling mechanism, Java has plenty of ways to do this: wait() and notify(), locks, semaphores, or maybe a non-standard library.
I don't think there is some kind of parallel step trigger in Spring Batch (but if there is, I hope someone posts it).
As I already said a little while ago in answer to your question, you need two splits: the first one loads file set A, and the second, file set B.
<job id="job1">
    <split id="splitForSet_A" task-executor="taskExecutor" next="splitForSet_B">
        <flow><step id="step1" parent="s1"/></flow>
        <flow><step id="step2" parent="s2"/></flow>
        <flow><step id="step3" parent="s3"/></flow>
    </split>
    <split id="splitForSet_B" task-executor="taskExecutor" next="stepWhatever">
        <flow><step id="step4" parent="s4"/></flow>
        <flow><step id="step5" parent="s5"/></flow>
        <flow><step id="step6" parent="s6"/></flow>
    </split>
    <step id="stepWhatever" parent="sx"/>
</job>
Steps 1, 2 and 3 will run in parallel (and load file set A); then, once they are all over, the second split (splitForSet_B) will start and run steps 4, 5 and 6 in parallel. A split is basically a step that contains steps running in parallel.
You just need to specify in each step which file it will be using (so it will differ between the steps in the first split and those in the second split).
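For anyone doing the JavaConfig migration mentioned earlier in this thread, a hedged sketch of the same shape with the Java builders (bean names are illustrative):

// Inside a @Configuration class; the Flow and Step beans are assumed to exist.
@Bean
public Job job1(JobBuilderFactory jobs, TaskExecutor taskExecutor,
                Flow flow1, Flow flow2, Flow flow3,
                Flow flow4, Flow flow5, Flow flow6, Step stepWhatever) {
    Flow setA = new FlowBuilder<Flow>("splitForSet_A")
            .split(taskExecutor).add(flow1, flow2, flow3).build();
    Flow setB = new FlowBuilder<Flow>("splitForSet_B")
            .split(taskExecutor).add(flow4, flow5, flow6).build();
    // setB starts only after every flow in setA has finished.
    return jobs.get("job1").start(setA).next(setB).next(stepWhatever).end().build();
}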
I'd use two partitioned steps. Each partitioner would be responsible for identifying the files in its respective set for the concurrent child steps to process:
<job>
    <step id="loadFirstSet" next="loadSecondSet">
        <partition partitioner="firstSetPartitioner">
            <handler task-executor="asyncTaskExecutor" />
            <step id="loadFileFromSetOne">
                <tasklet>
                    <chunk reader="someReader" writer="someWriter" commit-interval="#{jobParameters['commit.interval']}" />
                </tasklet>
            </step>
        </partition>
    </step>
    <step id="loadSecondSet">
        <partition partitioner="secondSetPartitioner">
            <handler task-executor="asyncTaskExecutor" />
            <step id="loadFileFromSecondSet">
                <tasklet>
                    <chunk reader="someOtherReader" writer="someOtherWriter" commit-interval="#{jobParameters['another.commit.interval']}" />
                </tasklet>
            </step>
        </partition>
    </step>
</job>
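A partitioner along those lines might look like this rough sketch (names are illustrative; Spring Batch's own MultiResourcePartitioner does essentially the same thing for an array of resources):

import java.util.HashMap;
import java.util.Map;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.core.io.Resource;

// Rough sketch: one partition (and hence one concurrent child step) per file.
public class FileSetPartitioner implements Partitioner {

    private final Resource[] files; // e.g. resolved from a pattern like file:input/setA/*.csv

    public FileSetPartitioner(Resource[] files) {
        this.files = files;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        for (int i = 0; i < files.length; i++) {
            ExecutionContext context = new ExecutionContext();
            // The child step's reader can bind to this via
            // #{stepExecutionContext['fileName']} on a step-scoped bean.
            context.putString("fileName", files[i].getFilename());
            partitions.put("partition" + i, context);
        }
        return partitions;
    }
}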
I have deployed a restartable job in Spring XD, and it FAILED due to some errors. But I am not able to restart the job from the admin console. Did I miss any configuration?
My job configuration looks like this:
<batch:job id="testjob" xmlns="http://www.springframework.org/schema/batch" restartable="true">
    <batch:step id="taskOne" next="taskTwo">
        <batch:tasklet ref="task1" />
    </batch:step>
    <batch:step id="taskTwo" next="taskThree">
        <batch:tasklet ref="task2" />
    </batch:step>
    <batch:step id="taskThree">
        <batch:tasklet ref="task3" />
    </batch:step>
</batch:job>
It looks like your job's BatchStatus makes it non-restartable even though its ExitStatus is FAILED. Do you see this in your log:
"Encountered fatal error executing job"
I have just tried to restart a restartable job whose ExitStatus was FAILED, and I could restart the failed job.
Please see the difference between BatchStatus and ExitStatus:
Difference between Batch Status and Exit Status in Spring Batch
I'm using Spring Batch for the first time. I tried some examples and read through the documentation, but I still have questions:
Can I skip one phase in chunk-oriented processing? For example: I fetch data from the database, process it, and determine that I need more data; can I skip the write phase and go on to the next step's read phase? Should I use a Tasklet instead?
How to implement a conditional flow?
Thank you very much,
Florian
You can skip items simply by throwing an exception that has been declared as a "skippable exception". You can do it as follows:
<step id="step1">
    <tasklet>
        <chunk reader="reader" writer="writer"
               commit-interval="10" skip-limit="10">
            <skippable-exception-classes>
                <include class="com.myapp.batch.MyException"/>
            </skippable-exception-classes>
        </chunk>
    </tasklet>
</step>
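The exception is typically thrown from your own reader/processor/writer code; a hedged sketch of a processor doing so (the item types and the isComplete check are placeholders):

import org.springframework.batch.item.ItemProcessor;

// Illustrative sketch: throwing the declared skippable exception makes the
// framework skip the offending item instead of failing the step.
public class ValidatingProcessor implements ItemProcessor<MyInput, MyOutput> {

    @Override
    public MyOutput process(MyInput item) throws Exception {
        if (!item.isComplete()) {              // placeholder condition
            throw new MyException("incomplete item, skipping");
        }
        return new MyOutput(item);             // placeholder mapping
    }
}

Note also that returning null from an ItemProcessor filters the item out of the write phase without any skip bookkeeping, which may already be enough for the "skip the write phase" part of your question.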
A conditional flow can easily be implemented by deciding on the ExitStatus of a step execution:
<job id="job">
    <step id="step1" parent="s1">
        <next on="*" to="stepB" />
        <next on="FAILED" to="stepC" />
    </step>
    <step id="stepB" parent="s2" next="stepC" />
    <step id="stepC" parent="s3" />
</job>
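For completeness, a hedged sketch of the same conditional flow with the Java builders (bean names are illustrative):

// Inside a @Configuration class; step1, stepB and stepC are beans defined elsewhere.
@Bean
public Job job(JobBuilderFactory jobs, Step step1, Step stepB, Step stepC) {
    return jobs.get("job")
            .start(step1)
                .on("FAILED").to(stepC)   // failure branch
            .from(step1)
                .on("*").to(stepB)        // everything else
            .from(stepB).next(stepC)
            .end()
            .build();
}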
Read the documentation to gain deeper knowledge of these topics: http://docs.spring.io/spring-batch/reference/html/configureStep.html