My batch job reads data from one table, processes it, and writes it to another table.
I have a MyBatisPagingItemReader, a custom processor, and a custom writer.
Currently the custom writer takes the data converted by the processor and does a batch INSERT into the other table.
Now the reader will also read some rows that have to be UPDATED in the other table. In this case my writer should also be capable of batch updating those records.
What is the best way to implement it?
Here is where I am stuck.
My writer:
public void write(final List unmodifiableItems) throws Exception {
    // unmodifiableItems is the list of rows to be inserted
}
How will I access the list of records that need to be UPDATED?
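One common way to handle this (a sketch, not necessarily the poster's final solution) is a ClassifierCompositeItemWriter: the processor flags each item as insert-or-update, and the classifier routes it to the matching delegate writer. MyItem and its isUpdate() flag below are hypothetical:

import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.ClassifierCompositeItemWriter;

// Hypothetical item type; the processor sets 'update' when the row
// already exists in the target table.
class MyItem {
    boolean update;
    boolean isUpdate() { return update; }
}

class WriterConfig {
    ClassifierCompositeItemWriter<MyItem> classifierWriter(
            ItemWriter<MyItem> insertWriter,   // the existing batch-INSERT writer
            ItemWriter<MyItem> updateWriter) { // a new batch-UPDATE writer
        ClassifierCompositeItemWriter<MyItem> writer = new ClassifierCompositeItemWriter<>();
        writer.setClassifier(item -> item.isUpdate() ? updateWriter : insertWriter);
        return writer;
    }
}

Within a chunk, the classifier groups items per delegate before writing, so the UPDATE items are still written as a batch.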
Related
Here is my scenario:
a1. read records from table A
a2. process these records one by one and generate a new temp table B for each record
b1. read records from table B, process their data, and save the result to a file
a3. tag the record from table A with a finished status
Pseudo code to describe this scenario:
foreach item in items:
    1. select a large amount of data where id=item.id, then save the result to temp table_id
    2. process all records in table_id, then write them to a file
    3. update item status
    4. send message to client
This is my design:
create a Spring Batch job, set a date as its parameter
create a step1 to read records from table A
create a step2 to read records from temporary table B and start it in the processor of step1
I checked the Spring Batch docs but didn't find anything about how to nest a step inside a step's processor. It seems the Step is the minimum unit in Spring Batch and cannot be split.
Update
Here is pseudo code for what I did to solve the problem:
(I'm using Spring Boot 2.7.8)
def Job:
    PagingItemReader(id):
        select data from temp_id
    FlatFileItemWriter:
application implements CommandLineRunner:
    items = TableARepository.selectAllBetweenDate
    for item : items:
        Service.createTempTableBWithId(item.id)
        Service.loadDataToTempTable(item.id)
        job = createJob(item.id)
        launcher.run(job)
        update item status
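For reference, a minimal Java sketch of that driver loop, assuming hypothetical TableARepository, Service and Item types (the names are taken from the pseudo code above):

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

@Component
class ExportRunner implements CommandLineRunner {

    private final JobLauncher launcher;
    private final Job exportJob;               // the PagingItemReader -> FlatFileItemWriter job
    private final TableARepository repository; // hypothetical
    private final Service service;             // hypothetical

    ExportRunner(JobLauncher launcher, Job exportJob,
                 TableARepository repository, Service service) {
        this.launcher = launcher;
        this.exportJob = exportJob;
        this.repository = repository;
        this.service = service;
    }

    @Override
    public void run(String... args) throws Exception {
        for (Item item : repository.selectAllBetweenDate()) {
            service.createTempTableBWithId(item.getId());
            service.loadDataToTempTable(item.getId());
            // the id is passed as a job parameter so the reader knows which temp table to query
            JobParameters params = new JobParametersBuilder()
                    .addLong("id", item.getId())
                    .toJobParameters();
            launcher.run(exportJob, params);
            repository.markFinished(item.getId()); // hypothetical status update
        }
    }
}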
A step is part of a job. It is not possible to start a step from within an item processor.
You need to break your requirement into different steps, without trying to "nest" steps into each other. In your case, you can do something like:
create a Spring Batch job, set a date as its parameter
create a step1 to read records from table A and generate a new temp table B
create a step2 to read records from temporary table B, process their data, save it to a file, and tag the records from table A with a finished status
The writer of step2 would be a composite writer: it writes data to the file AND updates the status of processed records in table A. This way, if there is an issue writing the file, the status update in table A is rolled back; the records were not processed correctly since the write operation failed, so they need to be reprocessed.
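A sketch of such a composite writer, assuming a hypothetical MyRecord item type, with the file writer and the status-update writer supplied from elsewhere; the delegates are invoked in order within the same chunk transaction:

import java.util.Arrays;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.CompositeItemWriter;

class MyRecord { /* hypothetical item type */ }

class Step2WriterConfig {
    // fileWriter would be the FlatFileItemWriter; statusWriter could be a
    // JdbcBatchItemWriter that flags the source records in table A as finished.
    CompositeItemWriter<MyRecord> step2Writer(ItemWriter<MyRecord> fileWriter,
                                              ItemWriter<MyRecord> statusWriter) {
        CompositeItemWriter<MyRecord> writer = new CompositeItemWriter<>();
        writer.setDelegates(Arrays.asList(fileWriter, statusWriter));
        return writer;
    }
}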
You should have a single job with two steps, stepA and stepB. Spring Batch provides controlled flow of step execution, so you can execute the two steps sequentially: once stepA has run its writer and written its data, stepB starts, and you can configure stepB to read the data written by stepA.
You can also pass data between steps using the job execution context: when stepA ends, put data into the job execution context, and stepB can access it once it starts. This helps in your case because stepA can pass along the identifier of the item it processed, so that stepB's writer has that identifier available to update the item's final status.
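A sketch of that hand-off using Spring Batch's ExecutionContextPromotionListener; the key name "itemId" and the writer body are assumptions:

import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.listener.ExecutionContextPromotionListener;
import org.springframework.batch.item.ItemWriter;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class PromotionConfig {

    // Register this listener on stepA: keys written to the step execution
    // context during stepA are promoted to the job execution context when
    // the step completes.
    @Bean
    ExecutionContextPromotionListener promotionListener() {
        ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
        listener.setKeys(new String[] { "itemId" });
        return listener;
    }

    // A step-scoped bean in stepB can then read the promoted value.
    @Bean
    @StepScope
    ItemWriter<Object> statusWriter(@Value("#{jobExecutionContext['itemId']}") Long itemId) {
        return items -> System.out.println("updating final status of item " + itemId);
    }
}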
My current Spring Boot application runs a scheduled job with a Spring Batch configuration where I have a FlatFileItemReader for reading the CSV rows, and a simple ItemWriter.
FlatFileItemReader<MyCsvRowDto>
ItemWriter<MyCsvRowDto>
Based on the chunk setup, the CSV rows are read one by one, and the writer gets all the data in a list.
I need to extend this logic to read the rows from the CSV and, additionally, a few things from repositories:
ItemReader<MyData>
ItemWriter<MyData>
where MyData contains the rows from CSV and additional things from repositories:
public class MyData {
    private MyDatabaseData dbData;
    private List<MyCsvRowData> csvData;
}
I am wondering whether it is possible to do this somehow with FlatFileItemReader, or whether I need to write a custom ItemReader that reads the data from the repositories and then, separately, the CSV rows with supercsv?
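One possible shape (a sketch under assumptions, not a confirmed answer) is a custom ItemReader that delegates to the existing FlatFileItemReader for the CSV part and enriches each item from the repositories; apart from FlatFileItemReader, every name below is a placeholder:

import java.util.List;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.file.FlatFileItemReader;

class MyDataReader implements ItemReader<MyData> {

    private final FlatFileItemReader<MyCsvRowDto> csvReader;
    private final MyRepository repository; // hypothetical repository

    MyDataReader(FlatFileItemReader<MyCsvRowDto> csvReader, MyRepository repository) {
        this.csvReader = csvReader;
        this.repository = repository;
    }

    @Override
    public MyData read() throws Exception {
        MyCsvRowDto row = csvReader.read();
        if (row == null) {
            return null; // end of input: tells Spring Batch the step is done reading
        }
        MyData data = new MyData();
        data.setCsvData(List.of(row));
        data.setDbData(repository.findDataFor(row)); // hypothetical lookup
        return data;
    }
}

Note that when a FlatFileItemReader is used as a delegate instead of as the step's reader, it is no longer opened automatically; it has to be registered as a stream on the step (or opened and closed manually) so its lifecycle callbacks still run.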
I have a job that is scheduled every day.
The job's functionality is below:
The reader reads the data from the database using a JdbcPagingItemReader.
The ItemProcessor processes the data by making an API call, which returns some updated data.
The writer writes the data back into the database.
The problem is that there is an online process which reads a particular row, processes it, and updates it.
I want to maintain consistency, so that I only update the data that was last processed.
Since the reading, processing, and writing are done in separate methods, how can I take a lock on the rows and process them?
I am using a PostgreSQL database.
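The question is open-ended, but one common pattern for this situation (an assumption, not the poster's chosen solution) is optimistic locking: read a version column along with the row and make the UPDATE conditional on it, so a row that the online process has modified in the meantime is simply left alone. A sketch against a hypothetical my_table(id, payload, version):

import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;

class OptimisticWriterConfig {
    // MyRow is a hypothetical bean with id, payload and version properties
    // populated by the reader. The WHERE clause turns the update into a no-op
    // if the online process bumped the version after the batch read the row.
    JdbcBatchItemWriter<MyRow> writer(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<MyRow>()
                .dataSource(dataSource)
                .sql("UPDATE my_table SET payload = :payload, version = version + 1 "
                        + "WHERE id = :id AND version = :version")
                .beanMapped()
                .assertUpdates(false) // do not fail the chunk when a stale row is skipped
                .build();
    }
}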
I have a Step in my job that reads from database A and then writes to database B & C.
If the select statement yields no results, I want it to continue to the processor and writer as usual. However, the writer is not called!
This matters because my writer is a composite item writer, one of whose delegates updates a control table (database C) to say that the reader read no results.
I could obviously add a new tasklet step after the step in question, but it is a partitioned step.
Is there a configuration property for the job that allows empty reads to be marked not as 'NOOP' or similar, but as successful?
You should be able to use a StepExecutionListener for this use case instead of an ItemWriter. Within StepExecutionListener#afterStep you can look at the number of items read and, if it is 0, do that db update. The writer piece is an ItemWriter, meaning it is intended to be used to write items that have been read.
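A minimal sketch of that listener, with the control-table update left as a placeholder:

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;

class EmptyReadListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // nothing to do before the step
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        if (stepExecution.getReadCount() == 0) {
            // hypothetical: update the control table in database C here
        }
        return stepExecution.getExitStatus();
    }
}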
Create a custom ItemReader that returns a sentinel item if no items are read.
Add a custom ItemWriter, mapped to the sentinel item class, in which you update the control table.
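A sketch of that sentinel reader, wrapping the real reader; the item types are placeholders, and a ClassifierCompositeItemWriter could then route the sentinel to the control-table writer:

import org.springframework.batch.item.ItemReader;

class SourceRecord { /* placeholder for the real item type */ }
class SentinelRecord extends SourceRecord { /* marks "no data was read" */ }

class SentinelReader implements ItemReader<SourceRecord> {

    private final ItemReader<SourceRecord> delegate;
    private boolean firstRead = true;

    SentinelReader(ItemReader<SourceRecord> delegate) {
        this.delegate = delegate;
    }

    @Override
    public SourceRecord read() throws Exception {
        SourceRecord item = delegate.read();
        if (item == null && firstRead) {
            firstRead = false;
            return new SentinelRecord(); // emit one sentinel so the writer still runs
        }
        firstRead = false;
        return item;
    }
}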
I am working on a Spring Batch application where I read from a stored procedure in a database and write the result to an XML file.
My writer is an org.springframework.batch.item.xml.StaxEventItemWriter.
I am trying to implement a situation in which I find duplicates using this method: Spring Batch how to filter duplicated items before send it to ItemWriter
However, in my situation I don't want to skip a duplicate but rather override an existing record already written to the XML by my ItemWriter.
How can I achieve it?
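The question is left open here, but note that StaxEventItemWriter emits XML events sequentially and cannot rewrite a record it has already written. One workaround (an assumption, not a confirmed answer) is to de-duplicate before the writer ever sees the items, keeping the last occurrence of each key, for example in an intermediate collection pass:

import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Placeholder for the real item type produced by the stored procedure.
class XmlRecord {
    String key;
    String getKey() { return key; }
}

// Keeps the last item seen for each key while preserving first-seen order;
// the surviving items are then handed to the StaxEventItemWriter step.
class LastWinsDeduplicator {

    private final Map<String, XmlRecord> byKey = new LinkedHashMap<>();

    void add(XmlRecord item) {
        byKey.put(item.getKey(), item); // a later duplicate replaces the earlier one
    }

    Collection<XmlRecord> items() {
        return byKey.values();
    }
}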