How can I get the number of line and the filename in itemProcessor - spring batch - spring

I am using spring batch to parse my files. In ItemProcessor I validate if the incoming fields are correct. If they are not I want to throw a ValidationException and log to a file the corresponding row which has the incorrect fields. So, how can I find the number of line and the filename in ItemProcessor?

Without seeing you ItemReader config I can't really be sure but if you are using something like FlatFileItemReader to parse a csv, if in strict mode it will validate the number of columns.
Assuming you reader looks like this, that is:
<bean id="iItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="linesToSkip" value="1"/>
<property name="comments" value="#" />
<property name="encoding" value="UTF-8"/>
<property name="lineMapper" >
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names">
<list >
<value>First_Field</value>
<value>Second_Field</value>
</list>
</property>
<property name="strict" value="true"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="uk.co.package.FieldSetMapper">
<property name="dateFormat" value="yyyy-MM-dd HH:mm:ss"/>
</bean>
</property>
</bean>
</property>
</bean>
It will throw a FlatFileParseException for any lines that can't be processed. This includes the line number and can be handled in a listener.

As for the line number, you might build your own LineMapper and then store the line-number in your business object. An example in which I store the line unprocessed (as-is) together with the line number:
DefaultLineMapper<OneRow> lineMapper = new DefaultLineMapper<OneRow>() {
#Override
public OneRow mapLine(String line, int lineNumber) throws Exception {
return new OneRow(lineNumber, line);
}
};
Of course you can already map your Object, I had the need to have the whole line unprocessed as input to my Processors.
As a reference with the same idea: https://stackoverflow.com/a/23770421/5658642

Related

How can I turn contents of CSV file into XML using Spring batch?

I'm using Spring Batch 3.0.8.RELEASE. I want to read the contents of a CSV file and turn it into XML. I am familiar with reading CSV files in Spring Batch, but the behavior I've seen is "chunk" oriented processing....one line at a time, and I'm not sure this default behavior will work for me here.
Here is the CSV sample:
,WT4RT,AIG-00,694304F,9/1/2017,9/30/2017,"6,975.00",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694317E,9/1/2017,9/30/2017,"2,583.80",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694304G,9/1/2017,9/30/2017,"17,600.00",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694304G,9/1/2017,9/30/2017,740,AIG-00201709,10/10/2017,USD,MC
I need to turn this data into the following XML format:
<?xml version="1.0" encoding="UTF-8"?>
<BillingAndCosting version="1.0">
<ControlArea>
<SenderId>CMS-BILLING-100</SenderId>
<WaterMark>92030293829329392030232</WaterMark>
<RecordCount>2</RecordCount>
<TimeStamp>2001-12-17T09:30:47-05:00</TimeStamp>
<DataFileName>String</DataFileName>
</ControlArea>
<DataArea>
<CustomerAccount>
<ExternalKey>1001</ExternalKey>
<ExternalSource>HBVenture</ExternalSource>
<BillingData>
<ReferenceID>0</ReferenceID>
<BillingInvoiceNumber>2000016</BillingInvoiceNumber>
<BillingInvoiceDate>2017-01-31T06:42:07.000Z</BillingInvoiceDate>
<BillingPeriodFromDate>2017-01-01T06:42:07.000Z</BillingPeriodFromDate>
<BillingPeriodThruDate>2017-01-31T06:42:07.000Z</BillingPeriodThruDate>
<BillingInvoiceType>NEW</BillingInvoiceType>
<BillingAmount CurrencyID="USD">1290.39</BillingAmount>
<InvoiceItem>
<CategoryCode>res-group</CategoryCode>
<TaxCategoryID>C2</TaxCategoryID>
<InvoiceItemAmount CurrencyID="USD">1290.39</InvoiceItemAmount>
<ProductID>694601F</ProductID>
<ISVUID>1</ISVUID>
</InvoiceItem>
</BillingData>
<BillingData>
<ReferenceID>0</ReferenceID>
<BillingInvoiceNumber>2000017</BillingInvoiceNumber>
<BillingInvoiceDate>2017-01-31T06:42:07.000Z</BillingInvoiceDate>
<BillingPeriodFromDate>2017-01-01T06:42:07.000Z</BillingPeriodFromDate>
<BillingPeriodThruDate>2017-01-31T06:42:07.000Z</BillingPeriodThruDate>
<BillingInvoiceType>NEW</BillingInvoiceType>
<BillingAmount CurrencyID="USD">590.39</BillingAmount>
<InvoiceItem>
<CategoryCode>gateway_resource_group</CategoryCode>
<TaxCategoryID>C2</TaxCategoryID>
<InvoiceItemAmount CurrencyID="USD">590.39</InvoiceItemAmount>
<ProductID>694601F</ProductID>
<ISVUID>1</ISVUID>
</InvoiceItem>
</BillingData>
I'm showing only a portion of the XML for brevity. The thing I don't know is: how to use Spring Batch to read the entire CSV file to populate an Object that I can then send to the XML Marshaller for converting into XML.
Can it be done with Spring batch, or do I have to 'roll my own'?
We use StaxEventItemWriter as ItemWriter (example below)
<bean id = "xmlItemWriter"
class = "org.springframework.batch.item.xml.StaxEventItemWriter">
<property name = "resource" value = "file:my_path_to_xml.xml" />
<property name = "marshaller" ref = "reportMarshaller" />
<property name = "rootTagName" value = "BillingAndCosting" />
</bean>
<bean id = "reportMarshaller"
class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name = "classesToBeBound">
<list>
<value>BillingAndCosting</value>
</list>
</property>
</bean>
You can use FlatFileItemReader for reading the CSV file. Follow the below sample code:
ItemReader reads a complete line one by one from input csv file. FlatFileItemReader is used for reading the csv file.
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<!-- Mapper which maps each individual items in a record to properties in POJO -->
<bean class="com.websystique.springbatch.ExamResultFieldSetMapper" />
</property>
<property name="lineTokenizer">
<!-- A tokenizer class to be used when items in input record are separated by specific characters -->
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
</bean>
</property>
</bean>
XML ItemWriter which writes the data in XML format.
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/examResult.xml" />
<property name="rootTagName" value="UniversityExamResultList" />
<property name="marshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.websystique.springbatch.model.ExamResult</value>
</list>
</property>
</bean>
</property>
</bean>
You should also go through the official Spring docs Spring Docs

Spring Batch - Last item from the reader alone is getting updated

I have to read from a file (FlatFile) and update a column if that ID present in the file matches the id in the column.The file is being read properly but only the last id value is getting updated here . Please find the snippet
Job-Config.xml
<bean id="abcitemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="file:datafile/outputs/ibdData.txt" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="names" value="ID,NAM,TYPE" />
<property name="delimiter" value="|"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="com.pershing.intraware.springbatch.mapper.abcFieldsetMapper" />
</property>
</bean>
</property>
</bean>
<bean id="abcitemWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter" scope="step">
<property name="dataSource" ref="dataSource" />
<property name="sql"><value>UPDATE TEST_abc SET BIZ_ARNG_CD = CASE WHEN ID IN (SELECT ID FROM TEST_abc WHERE ID= ? and MONTH=(to_char(sysdate, 'MM')) AND YR =(to_char(sysdate, 'YY'))) THEN 'Y' ELSE 'N' END</value></property>
<!-- It will take care matching between object property and sql name parameter -->
<property name="itemPreparedStatementSetter" ref="testPrepStatementSetter" />
</bean>
</beans>
Setter.java
public class IDItemPreparedStatementSetter implements ItemPreparedStatementSetter<Test> {
#Override
public void setValues(Test item, PreparedStatement ps) throws SQLException {
// TODO Auto-generated method stub
ps.setString(1, item.getID());
}
}
Your query is updating each row of database every time it is fired. You need to restrict that. Currently; it must be setting the BIZ_ARNG_CD to 'Y' for records with ID equal to the ID of the last record passed to the writer.
You can fix this in 2 ways -
Default the database column to 'N' and don't set it to 'N' in the update statement
Add where clause in update script ( BIZ_ARNG_CD != 'Y')

read only selective columns from csv file using spring batch

there is one csv file having 100 columns, but we need only 3-5 columns which needs to be loaded into database.
I dont want to specify all the 100 columns in linetokenizer in job xml.
Please suggest how we can proceed in this case
Try using a custom fieldSetMapper. You can use it similar to a ResultSet with indexes.
You have to list all the column names only if you want automatic mapping.
Specify only the delimiter, in your case ","
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="YOURFILE" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<bean class="CUSTOMFIELDSETMAPPER" />
</property>
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="," />
</bean>
</property>
</bean>
</property>
</bean>
The Custom Mapper could be something like this, say if you want to read the 1st column and 25th column:
public class CustomMapper implements FieldSetMapper<CustomPOJO>{
#Override
public CustomPOJO mapFieldSet(FieldSet fieldSet) throws BindException {
CustomPOJO result = new CustomPOJO();
result.setName(fieldSet.readString(0));
result.setAddress(fieldSet.readString(24));
return result;
}
}
For further explanation on how to use the reader, please refer to this tutorial

Reading in UTF-8 encoding in Java results in issues at only some places

I am reading a UTF_8 encoded file byte by byte in Java Spring Batch and saving the content in mongodb. The reading is delegated to FlatFileItemReader and the encoding is injected into the bean as UTF-8.
What I do see is that only at one or two specific places, the umlaut character is getting stored with an encoding problem in mongodb. I verified that the content in the original file is okay. What is interesting is that everywhere else, the umlaut character in the original file is interpreted correctly and stored correctly in mongodb.
Could you help with this?
<bean id="recordReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="file:#{jobExecutionContext['inputfile']}" />
<property name="linesToSkip" value = "1" />
<property name= "encoding" value ="UTF-8"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper" >
<property name = "lineTokenizer">
<bean class = "com.xyz.NoQuoteDelimitedLineTokenizer">
<property name = "strict" value = "false" />
<property name = "names" value = "Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8" />
<property name="delimiter">
<util:constant static-field="org.springframework.batch.item.file.transform.DelimitedLineTokenizer.DELIMITER_TAB" />
</property>
</bean>
</property>
<property name = "fieldSetMapper">
<bean class = "com.xyz.MyFieldSetMapper" >
<property name = "ctx" value="#{jobExecutionContext['contextBean']}"/>
</bean>
</property>
</bean>
</property>
</bean>
FieldSetMapper
private void validateFields (String id, ConsumerDO ci, FieldSet fieldset_p) throws Exception
{try {
ci.setCaption(fieldset_p.readString("newCaption"));
}
catch (Exception e) {
}
}

Spring static factory with factory-method and parameter

I have a problem transfering code to Spring applicationContext.xml
The source is:
File inFile = new File ("path/to/file/", "fileName.docx")
WordprocessingMLPackage wordMLPackage = Docx4J.load(inFile);
My not working solution is:
<bean id="inFile" class="java.io.File">
<constructor-arg value="path/to/file/" />
<constructor-arg value="fileName.docx" />
</bean>
<bean id="docx4j" class="org.docx4j.Docx4J" factory-method="load">
<constructor-arg ref="inFile" />
</bean>
<bean id="wordprocessingMLPackage" class="org.docx4j.openpackaging.packages.WordprocessingMLPackage" factory-bean="docx4j" />
What I'm getting out of the bean "wordprocessingMLPackage" is indeed an instance of the Class WordprocessingMLPackage, but it seems empty although the File I'm trying to load isn't (and yes, the path is doublechecked).
When trying
MainDocumentPart mdp = wordprocessingMLPackage.getMainDocumentPart();
List<Object> content = mdp.getContent();
I'm getting a NullPointerException because mdp is null!
Has anyone an idea... or even a solution?
============================================================
I found a solution especially for my problem.
Here is the source of Docx4j.load():
public static WordprocessingMLPackage load(File inFile) throws Docx4JException {
return WordprocessingMLPackage.load(inFile);
}
That means I can create an instance of WordprocessingMLPackage by its static self!
The code which is working:
<bean id="wordprocessingMLPackage" class="org.docx4j.openpackaging.packages.WordprocessingMLPackage" factory-method="load">
<constructor-arg ref="baseDocument" />
</bean>
So I found a lucky "workaround" for the original problem.
Since this question isn't urgent any more, I'm still interested in the correct solution, especially in a solution which allows injecting the WordprocessingMLPackage in other beans.
Thank you!
Here you need to make use of MethodInvokingFactoryBean as detailed below.
<bean id="beanId"
class="org.springframework.beans.factory.config.MethodInvokingFactoryBean">
<property name="targetClass" value="org.docx4j.Docx4J" />
<property name="targetMethod" value="load"/>
<property name="arguments">
<list>
<ref bean="inFile" />
</list>
</property>
</bean>
In your code get hold of applicationContext instance and invoke the below LOC
WordprocessingMLPackage ml = (WordprocessingMLPackage) applicationContext.getBean("beanId");
Let know in comments if you face any issues.
As Bond - Java Bond stated this works:
<bean id="inFile" class="java.io.File">
<constructor-arg value="path/to/file/" />
<constructor-arg value="fileName.docx" />
</bean>
<bean id="beanId" class="org.springframework.beans.factory.config.MethodInvokingFactoryBean">
<property name="targetClass" value="org.docx4j.Docx4J" />
<property name="targetMethod" value="load"/>
<property name="arguments">
<list>
<ref bean="inFile" />
</list>
</property>
</bean>
You can now use the bean as
WordprocessingMLPackage ml = (WordprocessingMLPackage) applicationContext.getBean("beanId");
or you can inject the bean directly as
<bean id="service" class="app.service.Service">
<property name="wordprocessingMLPackage" ref="beanId" />
</bean>
Thank you!!!

Resources