It appears to me that, out of the box, Spring Cloud Composed Task does not support passing parameters between tasks.
Can you please suggest some options for the requirement below?
a) I have a composed task in which a downloader is the first task; once that completes, two more tasks, say Item and Item Group, run. Once those complete, a transformation takes place.
b) I need to run the above composed task for different stores (e.g. store no. 1, 2, etc.).
Even if we use a database to pass the parameters between tasks, what unique ID can we use to relate them to the composed task run?
Let me briefly describe what I want and what I (maybe) know.
I want Spring Batch to run an async job; in the future, more jobs.
The job gets two parameters: an external id and a year.
The job should be able to be restarted after completion because the user wants to run a job with the same parameters again and again.
Only one job should be executed with the same parameters at the same time.
From outside (a web interface) it should be possible to query, by job name and parameters, whether a job is running.
The querier could be different from the job starter, so an instance or execution ID is not available.
I know that a job instance is the representation of the job (name) and the parameters, and - like you commented - I cannot rerun a job with the same parameters if the instance/execution is marked completed - unless I use an incrementer.
But this changes the parameters by adding a run.id. Now a job is restartable, but neither I nor Spring Batch itself can identify a running job instance (by name and original parameters) anymore, because every job run results in a new instance.
And the question "why would one restart a successfully completed job instance?" is easy to answer: the user outside doesn't know about jobs/instances/executions. The user will start some data processing for a year again and again. And it's my task to make that possible :).
So it would be nice if Spring Batch could let the user know "the job with your original parameters is still running".
Question:
What would be a good solution for my needs?
I haven't tried anything yet, but I have thought about it. Maybe I can write my own JobDao for my query? But this will not solve the run-instance-at-the-same-time problem. Or I can customize the JdbcJobInstanceDao or SimpleJobRepository? Maybe I must add my own job_key that contains only the original parameters?
To correctly understand the answer I am going to give to your question, it is important to know the difference and understand the relation between a job, a job instance and a job execution in Spring Batch. The "The Domain Language of Batch" section of the reference documentation explains that in detail with examples.
The job should be able to be restarted after completion.
This is not possible by design, or more precisely, a job instance cannot be restarted after completion by design (think of it like "why would one restart a successfully completed job instance?").
From outside (web interface) it should be possible to query if an instance is running by job name and parameters. The querier could be different from the job starter so an instance or execution id is not present.
The JobExplorer is the API you are looking for. You can ask for job instances and job executions as needed.
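For example, here is a minimal sketch of such a query (the class name, the page size of 10 and the constructor wiring are my placeholders, not something prescribed by Spring Batch):

import java.util.List;

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;

public class JobStatusQuery {

    private final JobExplorer jobExplorer;

    public JobStatusQuery(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;
    }

    // Lists the executions of the last 10 instances of the given job, with their status
    public void printRecentExecutions(String jobName) {
        List<JobInstance> instances = jobExplorer.getJobInstances(jobName, 0, 10);
        for (JobInstance instance : instances) {
            for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                System.out.println(instance + " -> " + execution.getStatus());
            }
        }
    }
}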
Question: What would be a good solution for my needs?
In your case, you receive an external ID and a year as a job execution request. Those two parameters can be used as identifying parameters to define job instances. With this in place, if a job instance fails, you can restart it by using the same parameters.
I see no need for an incrementer in your case. The incrementer is useful for jobs for which the instances can be defined as a "sequence" that can be "incremented". I see no need to create a custom DAO or JobRepository either; you should be able to implement your requirement with the built-in components by correctly defining what a job instance is.
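As a hedged sketch of that idea (the parameter names externalId and year and the class name are mine; both parameters are identifying by default, so together with the job name they define the job instance):

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

public class ProcessingJobStarter {

    private final JobLauncher jobLauncher;
    private final Job job;

    public ProcessingJobStarter(JobLauncher jobLauncher, Job job) {
        this.jobLauncher = jobLauncher;
        this.job = job;
    }

    public JobExecution start(String externalId, long year) throws Exception {
        JobParameters parameters = new JobParametersBuilder()
                .addString("externalId", externalId) // identifying by default
                .addLong("year", year)               // identifying by default
                .toJobParameters();
        // Same parameters while an execution is running -> JobExecutionAlreadyRunningException;
        // same parameters after success -> JobInstanceAlreadyCompleteException;
        // same parameters after a failure -> the failed instance is restarted.
        return jobLauncher.run(job, parameters);
    }
}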
For my use case I have to check whether an execution for a job/parameters combination is running. The parameters here are without the run.id of an incrementer. This check must be done before a job run and on an explicit REST call. Normally Spring Batch checks for running executions, but because of the incrementer every job instance is unique and it will never find any.
So I created a bean with a check method and made use of jobExplorer.findRunningJobExecutions(jobName). The result can then be compared with the used parameters by iterating over JobExecution.getJobParameters().getParameters().
The bean can be used in the REST method and in my own implementation of JobLauncher.run().
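A sketch of such a check bean (class and method names are mine, not from the original code; written against the Spring Batch 4 API):

import java.util.Map;

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameter;
import org.springframework.batch.core.explore.JobExplorer;

public class RunningJobChecker {

    private final JobExplorer jobExplorer;

    public RunningJobChecker(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;
    }

    // "original" holds the requested parameters, without the incrementer's run.id
    public boolean isRunning(String jobName, Map<String, JobParameter> original) {
        for (JobExecution execution : jobExplorer.findRunningJobExecutions(jobName)) {
            Map<String, JobParameter> actual = execution.getJobParameters().getParameters();
            // the running execution also carries a run.id, so check containment, not equality
            if (actual.entrySet().containsAll(original.entrySet())) {
                return true;
            }
        }
        return false;
    }
}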
Another solution would be to store the increment separately for a job/parameters combination. But I don't want to do this, not least because I think a framework like Spring Batch should do this for me or support me by reusing/restarting a completed job instance.
I have a requirement where 3 different files will be loaded into a single table with 3 different PIPEs. I want my target process to be triggered only once all 3 files have been loaded to my stage.
I don't want to run my target process multiple times.
So is there any way to put a start condition on a task based on PIPE success?
I went through the documentation but didn't find any such info. Is there a way of implementing this that I might be missing?
The general way to implement this pattern is with streams. Your pipes would load into three separate tables, each with a stream on it. You can then have a task that runs on a schedule, with the WHEN parameter checking SYSTEM$STREAM_HAS_DATA three times, once per stream. This ensures that your task only runs when all three pipes have completed successfully. Example:
CREATE TASK mytask1
  WAREHOUSE = mywh
  SCHEDULE = '5 minute'
WHEN
  SYSTEM$STREAM_HAS_DATA('MYSTREAM')
  AND SYSTEM$STREAM_HAS_DATA('MYSTREAM2')
  AND SYSTEM$STREAM_HAS_DATA('MYSTREAM3')
AS
  <Do stuff.>;
You have a couple of options here. You can:
1. use the data in the streams to do whatever you want to in the task, or
2. use the data in the streams to fill the single table that the three pipes were originally filling.
If you choose option 1, you might then also want to create a view that replaces your original single table.
If you choose option 2, you can set up a task that runs using the AFTER clause to do whatever it is that you want to do.
Please let me know: when I put, say, 5 files in a directory, 5 messages get generated by the poller. I want the Spring Batch job to be triggered only once, not five times, if the files come in together, say within a 1-minute window. Is that possible?
You may consider using an Aggregator for this kind of task. That way you will collect several files together, by expected size or within some time window. You need to use a static correlationKey to let the component group the files.
When the group is ready, a single message is emitted and you are good to trigger a Batch job for this set of files.
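A minimal Java DSL sketch of the idea (assuming a recent Spring Integration; the input directory, the static "allFiles" key and the one-minute window are placeholders of mine, not from the original answer):

import java.io.File;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.file.dsl.Files;

@Configuration
public class GroupedFilesConfig {

    @Bean
    public IntegrationFlow groupedFilesFlow() {
        return IntegrationFlows
                .from(Files.inboundAdapter(new File("/tmp/in")),
                        e -> e.poller(p -> p.fixedDelay(5000)))
                .aggregate(a -> a
                        .correlationStrategy(m -> "allFiles")  // static key: every file joins the same group
                        .releaseStrategy(g -> false)           // never release by group size
                        .groupTimeout(60000)                   // force a release after the time window
                        .sendPartialResultOnExpiry(true)       // emit whatever arrived within the window
                        .expireGroupsUponCompletion(true))     // later files start a fresh group
                .handle(m -> {
                    // m.getPayload() is a List<File>: launch the Spring Batch job once for the whole set
                })
                .get();
    }
}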
I have 5 different tasks that must be executed in parallel. The task implementations are in 5 different classes. Now I need to invoke these 5 classes in parallel. Also, the number of times each task is executed will differ per invocation.
Let's say I have a ProcessExecutor class that provides the list of tasks that need to be executed.
// This list will change dynamically on each invocation
List<MyTask> myTaskList = new ArrayList<MyTask>();
Based on some property value in MyTask, I need to invoke the corresponding task class and collect the results.
I am using Spring Boot 1.2.4 and Java 1.6.
You just need to send your tasks as payloads of messages to an ExecutorChannel and gather their results afterwards using the Aggregator component.
You can find all the necessary info in the Reference Manual:
https://docs.spring.io/spring-integration/docs/4.3.12.RELEASE/reference/html/messaging-channels-section.html#executor-channel
https://docs.spring.io/spring-integration/docs/4.3.12.RELEASE/reference/html/messaging-routing-chapter.html#aggregator
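Here is a minimal sketch of that pipeline using the Java DSL of later Spring Integration versions (with 4.3 on Java 1.6 you would express the same flow in XML, as shown in the linked manual). MyTask, MyResult and call() are placeholder types of mine, not from the question:

import java.util.List;
import java.util.concurrent.Executors;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.Gateway;
import org.springframework.integration.annotation.IntegrationComponentScan;
import org.springframework.integration.annotation.MessagingGateway;
import org.springframework.integration.config.EnableIntegration;
import org.springframework.integration.dsl.IntegrationFlow;

@Configuration
@EnableIntegration
@IntegrationComponentScan
public class ParallelTasksConfig {

    @MessagingGateway
    public interface TasksGateway {
        // Blocks until all results have been gathered into one list
        @Gateway(requestChannel = "tasks.input")
        List<MyResult> runAll(List<MyTask> tasks);
    }

    @Bean
    public IntegrationFlow tasks() {
        return f -> f
                .split()                                                    // one message per MyTask; sequence headers are set
                .channel(c -> c.executor(Executors.newFixedThreadPool(5)))  // ExecutorChannel: downstream runs in parallel
                .<MyTask, MyResult>transform(task -> task.call())           // execute one task
                .aggregate();                                               // default strategies regroup the split results
    }
}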
I am using a While activity for creating multiple tasks for a workflow. The code executes fine and the task is created when the loop runs only once. But when the loop runs twice or more, only one task gets created. Also, the WF status shows as "Error Occurred".
All I want to do here is create multiple tasks (the number of tasks depends on an entered column value) for the same user. Is it possible to use 'while' in this scenario? Or is there any other way to go ahead?
NB: I am using state machine workflow.
You may want to use a Replicator Activity, which will in turn "clone" its child activities. It can run them in parallel or sequentially.
I found Working with the Replicator Activity and an Until Condition useful.
Otherwise, without the Replicator, there is just one Task Activity.
In either case, make sure to assign a new Guid to the TaskId property. However, as an annoying "feature", it will not work if you just assign the TaskId property directly (I know, I tried, and was like "Wth?!?"). Instead, bind the TaskId to a Field/Property and then assign to that.