I've never used Spring Batch before but it seems like a viable option for what I am attempting to accomplish. I have about 15 CSV files for 10 institutions that I need to process nightly. I am stashing the CSV into staging tables in an Oracle database.
The CSV File may look something like this.
DEPARTMENT_ID,DEPARTMENT_NAME,DEPARTMENT_CODE
100,Computer Science & Engineering,C5321
101,Math,M333
...
However when I process the row and add it to the database I need to fill in an institution id which would be determined based on the folder being processed at that time.
The database table would like like this
INSTITUTION_ID,DEPARTMENT_ID,DEPARTMENT_NAME,DEPARTMENT_CODE
1100,100,Computer Science & Engineering,C5321
There is also validation that needs to be done on each row in the CSV files as well. Is that something Spring Batch can handle as well?
I've seen reference to CustomItemReader and CustomItemWriter but not sure if that is what I need. The examples I've seen seem basic just dumping a CSV exactly as it is into a matching table.
Yes , all the task that you have reported can be done by spring batch -
For the Reader you may use - multi Resource Item Reader with your wild card name matching your - file names .
To validate the rows from file you can use item processor and handle the validation.
And for your case you need not use the custom item writer - you can configure the item writer as DB item writer in your XML file.
I suggest you to use the XML based approach for Spring batch implementation.
The XML will be used to configure all the architecture of your batch - as in
job -- step -- chunk -- reader -- processor -- writer
and to track errors and exceptions you can implement listeners at each stage.
-- step Execution Listener
-- Item Reader Listener
-- Item Processor Listener
-- Item Writer Listener
Related
I have about 20 csv files each of them representing one DB table. I tried to use Spring Batch as example to load one table and it was fine. One single job, with one single step composed by: a reader, a processor and a writer. Anyway each bean in the definition is casted to the Entity representing the table. So using this approach, I think it is not feasible to load 20 tables. Is there a way to have a generic reader (with associated mapper), processor and writer (with the corresponding list of columns)? Or is there a smarter way to load such files in the database? Thanks for the helps
I am working on a spring batch application where I read from a stored procedure from database and write it to an xml file.
My writer is a org.springframework.batch.item.xml.StaxEventItemWriter
I am trying to implement a situation in which I find duplicates using this method - Spring Batch how to filter duplicated items before send it to ItemWriter
However, in my situation I don't want to skip a duplicate but override an existing record written to XML by my ItemWriter.
How can I achieve it ?
I read everywhere how to read data in spring batch itemReader and write in database using itemWriter, but I wanted to just read data using spring batch then somehow I wanted to access this list of items outside the job. I need to perform remaining processing after job finished.
The reason I wanted to do this is because I need to perform a lot of validations on every item. I have to validate each item's variable xyz if it exists in list(which is not available within job). After performing a lot of processing I have to insert information in different tables using JPA. Please help me out!
We have a use case where we receive data in flat files which we load into an Oracle DB using Spring Batch. Post data load in Oracle, we have to distribute the data in form of flat files to several consumers. The data selection criteria depends on some pre-decided values in some fields of the data.
We have a design in place which generates a list which contains objects that can be passed to a Spring Batch job as job parameter to generate the flat files needed to be sent to the data consumers.
Using a Splitter component, I can put the individual objects into a channel and plug a JobLaunchingGateway to launch a batch job to generate the flat file.
Need help on how I can launch multiple batch jobs in parallel using JobLaunchingGateway so that I can generate files in parallel.
A setup is already in place to FTP the files to consumers. We do not need to worry about that.
Use an ExecutorChannel with a task executor before the JobLaunchingGateway.
I have a requirement to write multiple files using Spring Batch. The first file will be written based on the data from the database table. The second file will contain just the number of records written to the first file. How can I create the second file? I am not sure whether org.springframework.batch.item.file.MultiResourceItemWriter is an option for me as I think it will write multiple files based on the data it will write chunks of data in the multiple files. Correct me if I am wrong here.
Please do suggest some options with sample code if possible.
You have couple of options:
You can use CompositeItemWriter which calls collection of item writers in defined order so you can define one item writer which will write records based on data from DB and second will count records and write that to another file.
You can write data to a file in first step, finish whole file and save it somewhere, you can save counter of records if that is all you need to StepContext (common batch patterns and scroll to 11.8 Passing Data to Future Steps) and read in new Taskletcounter and save to new file.
If you want to go with option 1 which I think is right choice you can check this example of batch job configuration with CompositeItemWriter