How to write custom flat file item reader using xml configuration - spring

I am new to Spring Batch. I am using a FlatFileItemReader configured in an XML file, followed by a processor that processes each object it creates. I need to pre-process the contents of the file before passing it to the FlatFileItemReader, and the processed results/file should not be written to disk. How can I do this through XML configuration?
Is it through a tasklet or by extending FlatFileItemReader? The processor should keep working as before, with no change. I need to introduce a layer before the file reaches the FlatFileItemReader.

You can use an ItemReadListener for this. ItemReadListener has three callback methods:
beforeRead, afterRead, and onReadError.
You can put your logic in beforeRead.
Sample code for a CustomItemReaderListener:
public class CustomItemReaderListener implements ItemReadListener<Domain> {

    @Override
    public void beforeRead() {
        System.out.println("ItemReadListener - beforeRead");
        // I need to pre-process contents of the file before passing it to the file item reader
        // add this logic here
    }

    @Override
    public void afterRead(Domain item) {
        System.out.println("ItemReadListener - afterRead");
    }

    @Override
    public void onReadError(Exception ex) {
        System.out.println("ItemReadListener - onReadError");
    }
}
Map the listener to the step in XML:
<step id="step1">
    <tasklet>
        <chunk reader="myReader" writer="flatFileItemWriter"
               commit-interval="1" />
        <listeners>
            <listener ref="customItemReaderListener" />
        </listeners>
    </tasklet>
</step>
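The step above references the listener bean by id, so the bean must also be declared; a minimal sketch, assuming CustomItemReaderListener lives in a package of your own (the package name here is illustrative):
<bean id="customItemReaderListener" class="com.example.batch.CustomItemReaderListener" />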

Related

Spring batch rollback all chunks & write one file at a time

I am a newbie in Spring Batch and I have a couple of questions.
Question 1: I am using a MultiResourceItemReader to read a bunch of CSV files and a JDBC item writer to update the DB in batches, with the commit interval set to 1000. If there is a file with 10k records and I encounter a DB error at the 7th batch, is there any way I can roll back all the previously committed chunks?
Question 2: If there are two files, each having 100 records, and the commit interval is set to 1000, then the MultiResourceItemReader reads both files and sends them to the writer. Is there any way to write just one file at a time, ignoring the commit interval in this case, essentially creating a loop in the writer alone?
Posting the solution that worked for me in case someone needs it for reference.
For Question 1, I was able to achieve it by extending StepListenerSupport in the writer and overriding beforeStep and afterStep. Sample snippet below:
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.listener.StepListenerSupport;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.item.ItemWriter;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;

public class JDBCWriter extends StepListenerSupport implements ItemWriter<MyDomain> {

    private static final Logger LOG = LoggerFactory.getLogger(JDBCWriter.class);

    private boolean errorFlag;
    private Connection connection;
    private String sql = "{ CALL STORED_PROC(?, ?, ?, ?, ?) }";

    @Autowired
    private JdbcTemplate jdbcTemplate;

    @Override
    public void beforeStep(StepExecution stepExecution) {
        try {
            // Hold one connection for the whole step and defer the commit to afterStep
            connection = jdbcTemplate.getDataSource().getConnection();
            connection.setAutoCommit(false);
        } catch (SQLException ex) {
            setErrorFlag(Boolean.TRUE);
        }
    }

    @Override
    public void write(List<? extends MyDomain> items) throws Exception {
        if (!items.isEmpty()) {
            CallableStatement callableStatement = connection.prepareCall(sql);
            callableStatement.setString(1, "FirstName");  // placeholder values for illustration
            callableStatement.setString(2, "LastName");
            callableStatement.setString(3, "Date of Birth");
            callableStatement.setInt(4, 1990);
            callableStatement.registerOutParameter(5, Types.INTEGER); // "errors" out parameter
            callableStatement.execute();
            int errors = callableStatement.getInt(5);
            if (errors != 0) {
                this.setErrorFlag(Boolean.TRUE);
            }
        } else {
            this.setErrorFlag(Boolean.TRUE);
        }
    }

    @Override
    public void afterChunk(ChunkContext context) {
        if (errorFlag) {
            context.getStepContext().getStepExecution().setExitStatus(ExitStatus.FAILED); // fail the step
            context.getStepContext().getStepExecution().setStatus(BatchStatus.FAILED);    // fail the batch
        }
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        try {
            if (!errorFlag) {
                connection.commit();
            } else {
                connection.rollback();
                stepExecution.setExitStatus(ExitStatus.FAILED);
            }
        } catch (SQLException ex) {
            LOG.error("Commit failed!", ex);
        }
        return stepExecution.getExitStatus();
    }

    public void setErrorFlag(boolean errorFlag) {
        this.errorFlag = errorFlag;
    }
}
XML Config:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       ....
       http://www.springframework.org/schema/batch/spring-batch-3.0.xsd">

    <job id="fileLoadJob" xmlns="http://www.springframework.org/schema/batch">
        <step id="batchFileUpload">
            <tasklet>
                <chunk reader="fileReader"
                       writer="JDBCWriter"
                       commit-interval="1000" />
            </tasklet>
        </step>
    </job>

    <bean id="fileReader" class="...com.FileReader" />
    <bean id="JDBCWriter" class="...com.JDBCWriter" />
</beans>
Question 1: The only way to accomplish this is via some form of compensating logic. You can do that via a listener (ChunkListener#afterChunkError for example), but the implementation is up to you. There is nothing within Spring Batch that knows what the overall state of the output is and how to roll it back beyond the current transaction.
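As a minimal sketch of where such compensating logic could hook in, assuming a custom ChunkListener (the undo logic itself is application-specific and only hinted at in a comment):
import org.springframework.batch.core.ChunkListener;
import org.springframework.batch.core.scope.context.ChunkContext;

public class CompensatingChunkListener implements ChunkListener {

    @Override
    public void beforeChunk(ChunkContext context) {
    }

    @Override
    public void afterChunk(ChunkContext context) {
    }

    @Override
    public void afterChunkError(ChunkContext context) {
        // Called after the failing chunk has rolled back; undoing the chunks
        // that were already committed is your own compensating logic, e.g.
        // deleting the rows this job execution has written so far.
    }
}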
Question 2: Assuming you're looking for one output file per input file: because most Resource implementations are non-transactional, the writers associated with them do special work to buffer output up to the commit point and then flush. The problem is that, because of this buffering, there is no real opportunity to divide that buffer across multiple resources. To be clear, it can be done; you'll just need a custom ItemWriter to do it.
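A minimal sketch of such a custom ItemWriter, assuming each item can report which input file it came from (getSourceFile() is a hypothetical accessor, not framework API):
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.springframework.batch.item.ItemWriter;

public class PerFileItemWriter implements ItemWriter<MyDomain> {

    @Override
    public void write(List<? extends MyDomain> items) throws Exception {
        // Split the buffered chunk by originating input file...
        Map<String, List<MyDomain>> byFile = new LinkedHashMap<>();
        for (MyDomain item : items) {
            byFile.computeIfAbsent(item.getSourceFile(), k -> new ArrayList<>()).add(item);
        }
        // ...then append each group to its own output file.
        for (Map.Entry<String, List<MyDomain>> entry : byFile.entrySet()) {
            List<String> lines = new ArrayList<>();
            for (MyDomain item : entry.getValue()) {
                lines.add(item.toString());
            }
            Files.write(Paths.get(entry.getKey() + ".out"), lines,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}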

How to get current file name and end of file in Item Processor in Spring Batch?

I am reading multiple .csv files from a location using a multi-file reader, and I need the names of the files in the ItemProcessor for all the input CSV files.
Is there a way to know in the ItemProcessor that the current file is completed?
You need to set the commitInterval value to 1 so the process is relaunched for each untreated file in the directory:
<bean id="simpleStep" class="org.springframework.batch.core.step.item.SimpleStepFactoryBean">
<property name="commitInterval" value="1" />
</bean>
Then you need to code your custom ItemProcessor:
public class CustomItemProcessor implements ItemProcessor<File, File> {

    @Override
    public File process(File input) throws Exception {
        // treat your file here
        System.out.println("File Name : " + input.getName());
        return input;
    }
}

ItemReaderAdapter to Read Custom DAO

I have a requirement to use Spring Batch to read via existing logic retrieved from the database, and the existing target object method returns a list of objects after querying the database.
So my task is to read this in chunks. The list size from the existing code is around 15000, but on implementing Spring Batch I wanted to read in chunks of 100, and this was not happening through ItemReaderAdapter.
The code snippets below give an idea of the issue. So, is this possible with Spring Batch? I looked at the Delegating Job Sample Spring example, but the service there returns a single object per call, not the total list.
Please advise.
Job.xml
<step id="firststep">
<tasklet>
<chunk reader="myreader" writer="mywriter" commit-interval="100" />
</tasklet>
</step>
<job id="firstjob" incrementer="idIncrementer">
<step id="step1" parent="firststep" />
</job>
<beans:bean id="myreader" class="org.springframework.batch.item.adapter.ItemReaderAdapter">
<beans:property name="targetObject" ref="readerService" />
<beans:property name="targetMethod" value="getCustomer" />
</beans:bean>
<beans:bean id="readerService" class="com.sh.java.ReaderService">
</beans:bean>
ReaderService.java
public class ReaderService {

    public List<CustomItem> getCustomer() throws Exception {
        /*
         * code to get database instances
         */
        List<CustomItem> customList = dao.getCustomers(date);
        System.out.println("Customer List Size: " + customList.size()); // here it is 15K
        return customList;
    }
}
Before all: reading a 15K List<> of objects might impact performance negatively; check if you can write a custom SQL query and use a JDBC/Hibernate cursor item reader instead.
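For example, a minimal sketch of such a cursor-based reader bean (the query, dataSource, and rowMapper references are assumptions):
<beans:bean id="customerCursorReader" class="org.springframework.batch.item.database.JdbcCursorItemReader">
    <beans:property name="dataSource" ref="dataSource" />
    <beans:property name="sql" value="SELECT ... FROM CUSTOMER" />
    <beans:property name="rowMapper" ref="customerRowMapper" />
</beans:bean>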
What you are trying to do is not possible with ItemReaderAdapter (it wasn't designed to read chunks of objects), but you can achieve the same result by writing a custom ItemReader extending AbstractItemCountingItemStreamItemReader to inherit the ItemStream capabilities and overriding the abstract or no-op methods; especially:
in doOpen(), call your readerService.getCustomer() and save the List<> in a class variable;
in doRead(), read the next item - from the List<> loaded in doOpen() - using the built-in index stored in the ExecutionContext.
@Bellabax,
Doing it the way you suggested still reads the entire set of database records in doOpen(); however, from the list retrieved in doOpen(), the reader then reads in chunks. Please advise.
CustomerReader.java
public class CustomerReader extends AbstractItemCountingItemStreamItemReader<Customer> {

    private List<Customer> customerList;

    public CustomerReader() {
    }

    @Override
    protected void doOpen() throws Exception {
        customerList = dao.getCustomers(date); // dao and date as in ReaderService above
        System.out.println("Customer List Size: " + customerList.size()); // this still prints 15K
        setMaxItemCount(customerList.size());
    }

    @Override
    protected Customer doRead() throws Exception {
        // Here reading 15K in chunks!
        return customerList.get(getCurrentItemCount() - 1);
    }

    @Override
    protected void doClose() throws Exception {
        customerList.clear();
        setMaxItemCount(0);
        setCurrentItemCount(0);
    }
}

Controlling Spring-Batch Step Flow in a java-config manner

According to the Spring Batch documentation (http://docs.spring.io/spring-batch/2.2.x/reference/html/configureStep.html#controllingStepFlow), controlling step flow in an XML config file is very simple;
e.g. I could write the following job configuration:
<job id="myJob">
<step id="step1">
<fail on="CUSTOM_EXIT_STATUS"/>
<next on="*" to="step2"/>
</step>
<step id="step2">
<end on="1ST_EXIT_STATUS"/>
<next on="2ND_EXIT_STATUS" to="step10"/>
<next on="*" to="step20"/>
</step>
<step id="step10" next="step11" />
<step id="step11" />
<step id="step20" next="step21" />
<step id="step21" next="step22" />
<step id="step22" />
</job>
Is there a simple way of defining such a job configuration in a java-config manner (using JobBuilderFactory and so on...)?
As the documentation also mentions, we can only branch the flow based on the exit status of a step. To be able to report a custom exit status (possibly different from the one automatically mapped from the batch status), we must provide an afterStep method via a StepExecutionListener.
Suppose we have an initial step step1 (an instance of a Tasklet class Step1), and we want the following:
if step1 fails (e.g. by throwing a runtime exception), the entire job should be considered FAILED;
if step1 completes with an exit status of COMPLETED-WITH-A, we branch to a step step2a which supposedly handles this specific case;
otherwise, we stay on the main track of the job and continue with step step2.
Now, provide an afterStep method inside the Step1 class (which also implements StepExecutionListener):
private static class Step1 implements Tasklet, StepExecutionListener {

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        logger.info("*after-step1* step-execution={}", stepExecution.toString());
        // Report a different exit-status in a random manner (just a demo!).
        // Some of these exit statuses (COMPLETED-WITH-A) are Step1-specific
        // and are used to base a conditional flow on them.
        ExitStatus exitStatus = stepExecution.getExitStatus();
        if (!"FAILED".equals(exitStatus.getExitCode())) {
            double r = Math.random();
            if (r < 0.50)
                exitStatus = null; // i.e. COMPLETED
            else
                exitStatus = new ExitStatus(
                    "COMPLETED-WITH-A",
                    "Completed with some special condition A");
        }
        logger.info("*after-step1* reporting exit-status of {}", exitStatus);
        return exitStatus;
    }

    // ... other methods of Step1
}
Finally, build the job flow inside the createJob method of our JobFactory implementation:
@Override
public Job createJob() {
    // Assume some factories returning instances of our Tasklets
    Step step1 = step1();
    Step step2a = step2a();
    Step step2 = step2();

    JobBuilder jobBuilder = jobBuilderFactory.get(JOB_NAME)
        .incrementer(new RunIdIncrementer())
        .listener(listener); // a job-level listener

    // Build the job flow
    return jobBuilder
        .start(step1)
            .on("FAILED").fail()
        .from(step1)
            .on("COMPLETED-WITH-A").to(step2a)
        .from(step1)
            .next(step2)
        .end()
        .build();
}
Maybe. If your intention is to write something similar to a flow decider programmatically (using Spring Batch's framework interfaces, I mean), there is a built-in implementation, and it is enough for most use cases.
As opposed to XML config, you can use Java-config annotations if you are familiar with them; personally I prefer the XML definition, but that's only a personal opinion.
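For reference, a minimal sketch of that built-in approach using Spring Batch's JobExecutionDecider interface (the status names are illustrative):
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;

public class MyDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        // Route on the previous step's exit status; stepExecution may be null
        // if no step has run before this decider.
        if (stepExecution != null
                && "COMPLETED-WITH-A".equals(stepExecution.getExitStatus().getExitCode())) {
            return new FlowExecutionStatus("GO-A");
        }
        return FlowExecutionStatus.COMPLETED;
    }
}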

Spring Batch SkipListener not called when exception occurs in reader

This is my step configuration. My skip listener's onSkipInWrite() method is called properly, but onSkipInRead() is not getting called. I found this by deliberately throwing a NullPointerException from my reader.
<step id="callService" next="writeUsersAndResources">
<tasklet allow-start-if-complete="true">
<chunk reader="Reader" writer="Writer"
commit-interval="10" skip-limit="10">
<skippable-exception-classes>
<include class="java.lang.Exception" />
</skippable-exception-classes>
</chunk>
<listeners>
<listener ref="skipListener" />
</listeners>
</tasklet>
</step>
I read some forums and tried the listeners tag at both levels: inside the chunk and outside the tasklet. Nothing is working...
Adding my skip listener here:
package com.legal.batch.core;

// Note: this class shadows the framework interface of the same name, so the
// interface has to be referenced by its fully qualified name.
public class SkipListener implements org.springframework.batch.core.SkipListener<Object, Object> {

    @Override
    public void onSkipInProcess(Object arg0, Throwable arg1) {
        // TODO Auto-generated method stub
    }

    @Override
    public void onSkipInRead(Throwable arg0) {
    }

    @Override
    public void onSkipInWrite(Object arg0, Throwable arg1) {
    }
}
Experts please suggest
Skip listeners respect the transaction boundary, which means they are always called just before the transaction is committed.
Since the commit interval in your example is set to 10, onSkipInRead will be called right at the moment those 10 items are committed (at once).
Hence, if you try step-by-step debugging, you will not see onSkipInRead called right away after the ItemReader throws an exception.
The SkipListener in your example has an empty onSkipInRead method. Try adding some logging inside onSkipInRead and rerun your job to see those messages.
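For instance, a minimal logging sketch (the message format is an assumption):
@Override
public void onSkipInRead(Throwable arg0) {
    // Fires when the chunk's transaction commits, not at the moment of the failed read
    System.out.println("SkipListener - onSkipInRead: " + arg0);
}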
EDIT:
Here is a working example [names are changed to 'abc']:
<step id="abcStep" xmlns="http://www.springframework.org/schema/batch">
<tasklet>
<chunk writer="abcWriter"
reader="abcReader"
commit-interval="${abc.commit.interval}"
skip-limit="1000" >
<skippable-exception-classes>
<include class="com.abc....persistence.mapping.exception.AbcMappingException"/>
<include class="org.springframework.batch.item.validator.ValidationException"/>
...
<include class="...Exception"/>
</skippable-exception-classes>
<listeners>
<listener ref="abcSkipListener"/>
</listeners>
</chunk>
<listeners>
<listener ref="abcStepListener"/>
<listener ref="afterStepStatsListener"/>
</listeners>
<no-rollback-exception-classes>
<include class="com.abc....persistence.mapping.exception.AbcMappingException"/>
<include class="org.springframework.batch.item.validator.ValidationException"/>
...
<include class="...Exception"/>
</no-rollback-exception-classes>
<transaction-attributes isolation="READ_COMMITTED"
propagation="REQUIRED"/>
</tasklet>
</step>
where an abcSkipListener bean is:
public class AbcSkipListener {

    private static final Logger logger = LoggerFactory.getLogger("abc-skip-listener");

    @OnReadError
    public void houstonWeHaveAProblemOnRead(Exception problem) {
        // ...
    }

    @OnSkipInWrite
    public void houstonWeHaveAProblemOnWrite(AbcHolder abcHolder, Throwable problem) {
        // ...
    }

    // ....
}
I come back to this subject after having had the same problem in later versions, where XML configuration is no longer used.
With the configuration below, I was not able to reach the skip listener implementations:
@Bean
public Step step1() {
    return stepBuilderFactory
            .get("step1").<String, List<Integer>>chunk(1)
            .reader(reader)
            .processor(processor)
            .faultTolerant()
            .skipPolicy(skipPolicy)
            .writer(writer)
            .listener(stepListener)
            .listener(skipListener)
            .build();
}
The issue here is that the placement of the skip listener is not correct.
The skip listener should be registered on the faultTolerantStepBuilder, i.e. directly after the faultTolerant() call:
@Bean
public Step step1() {
    return stepBuilderFactory
            .get("step1").<String, List<Integer>>chunk(1)
            .reader(reader)
            .processor(processor)
            .faultTolerant()
            .listener(skipListener)
            .skipPolicy(skipPolicy)
            .writer(writer)
            .listener(stepListener)
            .build();
}
In the first snippet, the listener is registered as a plain listener on a simpleStepBuilder, so it is never treated as a skip listener.
