How to write multiple record types to a single flat file in Spring Batch - spring

I have an xml with multiple element types. Below is a simple xml for prototype.
The actual xml has more element types.
<company>
<FileHeader>
<fh_custId>Id</fh_custId>
<fh_custName>Name</fh_custName>
<fh_custAge>Age</fh_custAge>
<fh_dob>DOB</fh_dob>
<fh_income>Income</fh_income>
</FileHeader>
<record refId="1001">
<name>John</name>
<age>31</age>
<dob>31/8/1982</dob>
<income>200,000</income>
</record>
</company>
I am using StaxEventItemReader with Jaxb2Marshaller to read the XML. I want to write to a single fixed-width flat file. A FlatFileItemWriter with the settings below can only handle one element type (in this case the "record" element).
How can I configure other element types as well, for instance the "FileHeader" element in the XML above?
<property name="lineAggregator">
<bean
class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
<property name="fieldExtractor">
<bean
class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="refId, name, age, csvDob, income" />
</bean>
</property>
<property name="format" value="%-6.6s%-15.15s%-4.4s%-12.12s%-10.10s"/>
</bean>
</property>

I think you can use a custom LineAggregator combined with a Classifier to dispatch to the right line aggregator.
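A minimal sketch of that idea, assuming one item class per element type. To stay runnable without Spring on the classpath it declares a local `LineAggregator` interface with the same single method as Spring Batch's; `FileHeader`, `CompanyRecord`, `ClassifierLineAggregator` and `MultiRecordDemo` are hypothetical names, not part of the library. In a real job the class would implement Spring Batch's interface and be set as the writer's lineAggregator.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in with the same single method as Spring Batch's LineAggregator<T>,
// declared here only so the sketch compiles without Spring on the classpath.
interface LineAggregator<T> {
    String aggregate(T item);
}

// Hypothetical item classes for the two element types in the sample XML.
class FileHeader {
    final String custId, custName, custAge, dob, income;
    FileHeader(String custId, String custName, String custAge, String dob, String income) {
        this.custId = custId; this.custName = custName; this.custAge = custAge;
        this.dob = dob; this.income = income;
    }
}

class CompanyRecord {
    final String refId, name, age, csvDob, income;
    CompanyRecord(String refId, String name, String age, String csvDob, String income) {
        this.refId = refId; this.name = name; this.age = age;
        this.csvDob = csvDob; this.income = income;
    }
}

// Dispatches each item to the aggregator registered for its exact class.
class ClassifierLineAggregator implements LineAggregator<Object> {
    private final Map<Class<?>, LineAggregator<Object>> delegates = new HashMap<>();

    <T> ClassifierLineAggregator register(Class<T> type, LineAggregator<? super T> delegate) {
        // type.cast keeps the dispatch type-safe without unchecked casts.
        delegates.put(type, item -> delegate.aggregate(type.cast(item)));
        return this;
    }

    @Override
    public String aggregate(Object item) {
        LineAggregator<Object> delegate = delegates.get(item.getClass());
        if (delegate == null) {
            throw new IllegalArgumentException("No line aggregator for " + item.getClass().getName());
        }
        return delegate.aggregate(item);
    }
}

class MultiRecordDemo {
    // Same fixed-width layout as the format in the question (47 characters per line).
    static final String FORMAT = "%-6.6s%-15.15s%-4.4s%-12.12s%-10.10s";

    static ClassifierLineAggregator build() {
        return new ClassifierLineAggregator()
            .register(FileHeader.class,
                h -> String.format(FORMAT, h.custId, h.custName, h.custAge, h.dob, h.income))
            .register(CompanyRecord.class,
                r -> String.format(FORMAT, r.refId, r.name, r.age, r.csvDob, r.income));
    }

    public static void main(String[] args) {
        ClassifierLineAggregator aggregator = build();
        System.out.println(aggregator.aggregate(new FileHeader("Id", "Name", "Age", "DOB", "Income")));
        System.out.println(aggregator.aggregate(new CompanyRecord("1001", "John", "31", "31/8/1982", "200,000")));
    }
}
```

The lookup is by exact class, so subclasses of a registered type would need their own registration in this sketch.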

Related

How can I turn contents of CSV file into XML using Spring batch?

I'm using Spring Batch 3.0.8.RELEASE. I want to read the contents of a CSV file and turn it into XML. I am familiar with reading CSV files in Spring Batch, but the behavior I've seen is "chunk"-oriented processing, one line at a time, and I'm not sure this default behavior will work for me here.
Here is the CSV sample:
,WT4RT,AIG-00,694304F,9/1/2017,9/30/2017,"6,975.00",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694317E,9/1/2017,9/30/2017,"2,583.80",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694304G,9/1/2017,9/30/2017,"17,600.00",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694304G,9/1/2017,9/30/2017,740,AIG-00201709,10/10/2017,USD,MC
I need to turn this data into the following XML format:
<?xml version="1.0" encoding="UTF-8"?>
<BillingAndCosting version="1.0">
<ControlArea>
<SenderId>CMS-BILLING-100</SenderId>
<WaterMark>92030293829329392030232</WaterMark>
<RecordCount>2</RecordCount>
<TimeStamp>2001-12-17T09:30:47-05:00</TimeStamp>
<DataFileName>String</DataFileName>
</ControlArea>
<DataArea>
<CustomerAccount>
<ExternalKey>1001</ExternalKey>
<ExternalSource>HBVenture</ExternalSource>
<BillingData>
<ReferenceID>0</ReferenceID>
<BillingInvoiceNumber>2000016</BillingInvoiceNumber>
<BillingInvoiceDate>2017-01-31T06:42:07.000Z</BillingInvoiceDate>
<BillingPeriodFromDate>2017-01-01T06:42:07.000Z</BillingPeriodFromDate>
<BillingPeriodThruDate>2017-01-31T06:42:07.000Z</BillingPeriodThruDate>
<BillingInvoiceType>NEW</BillingInvoiceType>
<BillingAmount CurrencyID="USD">1290.39</BillingAmount>
<InvoiceItem>
<CategoryCode>res-group</CategoryCode>
<TaxCategoryID>C2</TaxCategoryID>
<InvoiceItemAmount CurrencyID="USD">1290.39</InvoiceItemAmount>
<ProductID>694601F</ProductID>
<ISVUID>1</ISVUID>
</InvoiceItem>
</BillingData>
<BillingData>
<ReferenceID>0</ReferenceID>
<BillingInvoiceNumber>2000017</BillingInvoiceNumber>
<BillingInvoiceDate>2017-01-31T06:42:07.000Z</BillingInvoiceDate>
<BillingPeriodFromDate>2017-01-01T06:42:07.000Z</BillingPeriodFromDate>
<BillingPeriodThruDate>2017-01-31T06:42:07.000Z</BillingPeriodThruDate>
<BillingInvoiceType>NEW</BillingInvoiceType>
<BillingAmount CurrencyID="USD">590.39</BillingAmount>
<InvoiceItem>
<CategoryCode>gateway_resource_group</CategoryCode>
<TaxCategoryID>C2</TaxCategoryID>
<InvoiceItemAmount CurrencyID="USD">590.39</InvoiceItemAmount>
<ProductID>694601F</ProductID>
<ISVUID>1</ISVUID>
</InvoiceItem>
</BillingData>
I'm showing only a portion of the XML for brevity. The thing I don't know is: how to use Spring Batch to read the entire CSV file to populate an Object that I can then send to the XML Marshaller for converting into XML.
Can it be done with Spring Batch, or do I have to 'roll my own'?
We use StaxEventItemWriter as the ItemWriter (example below):
<bean id = "xmlItemWriter"
class = "org.springframework.batch.item.xml.StaxEventItemWriter">
<property name = "resource" value = "file:my_path_to_xml.xml" />
<property name = "marshaller" ref = "reportMarshaller" />
<property name = "rootTagName" value = "BillingAndCosting" />
</bean>
<bean id = "reportMarshaller"
class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name = "classesToBeBound">
<list>
<value>BillingAndCosting</value>
</list>
</property>
</bean>
You can use FlatFileItemReader to read the CSV file; it reads the input one complete line at a time. Follow the sample configuration below.
The sample uses "|" as the delimiter; for the comma-separated file above, set it to ",".
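<bean class="org.springframework.batch.item.file.FlatFileItemReader">
<!-- Set the "resource" property to the input CSV file here -->
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">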
<property name="fieldSetMapper">
<!-- Mapper which maps each individual items in a record to properties in POJO -->
<bean class="com.websystique.springbatch.ExamResultFieldSetMapper" />
</property>
<property name="lineTokenizer">
<!-- A tokenizer class to be used when items in input record are separated by specific characters -->
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
</bean>
</property>
</bean>
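What the tokenizer and mapper do for one row of the question's CSV can be sketched in plain Java. This is only an illustration, not Spring Batch code: `BillingLine` and `BillingLineParser` are hypothetical names, and the meaning given to columns 3, 6 and 9 (product, amount, currency) is a guess from the sample rows. The point is that the quoted amount "6,975.00" has to be treated as a single field before it is converted to a number.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical item for one CSV row; only three of the columns are kept here.
class BillingLine {
    final String productId;
    final BigDecimal amount;
    final String currency;

    BillingLine(String productId, BigDecimal amount, String currency) {
        this.productId = productId;
        this.amount = amount;
        this.currency = currency;
    }
}

class BillingLineParser {
    // Splits on commas that are outside double quotes.
    // Escaped quotes ("") inside a field are not handled in this sketch.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // quotes delimit a field and are dropped
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());        // last field has no trailing comma
        return fields;
    }

    // Column positions are assumed from the sample: 3 = product, 6 = amount, 9 = currency.
    static BillingLine parse(String line) {
        List<String> f = split(line);
        // Remove the thousands separator before converting, e.g. "6,975.00" becomes "6975.00".
        BigDecimal amount = new BigDecimal(f.get(6).replace(",", ""));
        return new BillingLine(f.get(3), amount, f.get(9));
    }
}
```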
XML ItemWriter which writes the data in XML format.
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/examResult.xml" />
<property name="rootTagName" value="UniversityExamResultList" />
<property name="marshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.websystique.springbatch.model.ExamResult</value>
</list>
</property>
</bean>
</property>
</bean>
You should also go through the official Spring Batch documentation.

read only selective columns from csv file using spring batch

There is a CSV file with 100 columns, but only 3-5 of them need to be loaded into the database.
I don't want to specify all 100 columns in the lineTokenizer in the job XML.
Please suggest how to proceed in this case.
Try using a custom FieldSetMapper. You can read from the FieldSet by index, much like a ResultSet.
You only have to list all the column names if you want automatic mapping.
Otherwise, specify only the delimiter, in your case ",":
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="YOURFILE" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<bean class="CUSTOMFIELDSETMAPPER" />
</property>
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="," />
</bean>
</property>
</bean>
</property>
</bean>
The Custom Mapper could be something like this, say if you want to read the 1st column and 25th column:
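import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class CustomMapper implements FieldSetMapper<CustomPOJO> {
    @Override
    public CustomPOJO mapFieldSet(FieldSet fieldSet) throws BindException {
        CustomPOJO result = new CustomPOJO();
        // FieldSet indexes are zero-based: 0 is the 1st column, 24 is the 25th.
        result.setName(fieldSet.readString(0));
        result.setAddress(fieldSet.readString(24));
        return result;
    }
}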
For further explanation on how to use the reader, please refer to this tutorial

Dynamic value to Spring Property Ref tag

Is there any way to pass a dynamic value from Java to the ref attribute in the code below?
<bean id="jobLauncherTestUtils" class="org.springframework.batch.test.JobLauncherTestUtils" >
<property name="job" ref="$(dynamicValue)"/>
<property name="jobLauncher" ref="jobLauncher"/>
<property name="jobRepository" ref="jobRepository" />
</bean>
You can load the value from a properties file.
If the value has to change at runtime, you must build and launch at runtime.

How can I get the number of line and the filename in itemProcessor - spring batch

I am using Spring Batch to parse my files. In the ItemProcessor I validate whether the incoming fields are correct. If they are not, I want to throw a ValidationException and log the row with the incorrect fields to a file. So how can I find the line number and the filename in the ItemProcessor?
Without seeing your ItemReader config I can't be sure, but if you are using something like FlatFileItemReader to parse a CSV, it will validate the number of columns when in strict mode.
Assuming your reader looks like this:
<bean id="iItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="linesToSkip" value="1"/>
<property name="comments" value="#" />
<property name="encoding" value="UTF-8"/>
<property name="lineMapper" >
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names">
<list >
<value>First_Field</value>
<value>Second_Field</value>
</list>
</property>
<property name="strict" value="true"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="uk.co.package.FieldSetMapper">
<property name="dateFormat" value="yyyy-MM-dd HH:mm:ss"/>
</bean>
</property>
</bean>
</property>
</bean>
It will throw a FlatFileParseException for any lines that can't be processed. This includes the line number and can be handled in a listener.
As for the line number, you might build your own LineMapper and then store the line-number in your business object. An example in which I store the line unprocessed (as-is) together with the line number:
DefaultLineMapper<OneRow> lineMapper = new DefaultLineMapper<OneRow>() {
    @Override
    public OneRow mapLine(String line, int lineNumber) throws Exception {
        // Keep the raw line together with its line number.
        return new OneRow(lineNumber, line);
    }
};
Of course you can map your object right there; I needed the whole unprocessed line as input to my processors.
As a reference with the same idea: https://stackoverflow.com/a/23770421/5658642
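A rough sketch of how a processor-side check could then report the offending row, in plain Java. Everything here is hypothetical: this `OneRow` carries an extra `fileName` field that the snippet above does not have (it would have to be supplied by whoever configures the reader's resource), and `RowValidationException` stands in for whatever exception the job uses, for example Spring Batch's ValidationException.

```java
// Hypothetical holder for a raw line and its origin; fileName is an addition for this sketch.
final class OneRow {
    final String fileName;
    final int lineNumber;
    final String line;

    OneRow(String fileName, int lineNumber, String line) {
        this.fileName = fileName;
        this.lineNumber = lineNumber;
        this.line = line;
    }
}

// Hypothetical exception type used only in this sketch.
class RowValidationException extends RuntimeException {
    RowValidationException(String message) {
        super(message);
    }
}

class RowValidator {
    // Returns the fields, or fails with a message that pinpoints the offending row.
    static String[] validate(OneRow row, int expectedFields) {
        String[] fields = row.line.split(",", -1); // -1 keeps trailing empty fields
        if (fields.length != expectedFields) {
            throw new RowValidationException(String.format(
                "%s, line %d: expected %d fields but found %d: %s",
                row.fileName, row.lineNumber, expectedFields, fields.length, row.line));
        }
        return fields;
    }
}
```

Because the row carries its own origin, the message can be logged as-is wherever the exception is caught.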

Is it possible to alias bean class names in Spring?

I have a string property which looks similar to the following example:
<property name="mappingData">
<list>
<bean class="com.company.product.longNamingStandard.migration.extractor.FieldMapping">
<property name="elementName" value="entitlement.user"/>
<property name="mapping" value="DocUsers"/>
</bean>
<bean class="com.company.product.longNamingStandard.migration.extractor.FieldMapping">
<property name="elementName" value="entitlement.contributor"/>
<property name="mapping" value="DocContributors"/>
</bean>
</list>
</property>
The long class names affect readability and also create a refactoring overhead.
Is it possible to alias the class name and use a short name to declare the beans? Or is there an alternate best practice I'm missing?
Probably a bit late for you, but hopefully useful for others:
You can use parent beans to accomplish this.
First declare a parent bean as a template:
<bean id="FieldMapping" class="com.company.product.longNamingStandard.migration.extractor.FieldMapping"/>
Then use it elsewhere, using the parent attribute.
<property name="mappingData">
<list>
<bean parent="FieldMapping">
<property name="elementName" value="entitlement.user"/>
<property name="mapping" value="DocUsers"/>
</bean>
<bean parent="FieldMapping">
<property name="elementName" value="entitlement.contributor"/>
<property name="mapping" value="DocContributors"/>
</bean>
</list>
</property>
Note that my convention is to use upper-case ids for the parent template beans.
Each <bean/> has name and id attributes to help you reference those beans later in your configuration.
I would suggest using the id for declaring the bean.
Your config could look like:
<bean id="fooBean" class="com.example.foo"/>
<bean id="barBean" class="com.example.bar"/>
<list>
<ref bean="fooBean"/>
<ref bean="barBean"/>
</list>
You may try to represent your mapping in some short form, and then convert it to the list of FieldMappings. For example, mappings from your snippet may be represented as a map.
As a theoretical exercise, in Spring 3 you can do this with the Spring Expression Language (if FieldMapping has the appropriate constructor):
<util:map id = "m">
<entry key = "entitlement.user" value = "DocUsers" />
<entry key = "entitlement.contributor" value = "DocContributors" />
</util:map>
...
<property name = "mappingData"
value = "#{m.![new com.company.product.longNamingStandard.migration.extractor.FieldMapping(key, value)]}" />
If this expression is too obscure, you may implement a FactoryBean to take a short form of your mapping data (for example, a map, as in this example) and return a configured list of FieldMappings:
<property name = "mappingData">
<bean class = "FieldMappingListFactoryBean">
<property name = "mappings">
<map>
<entry key = "entitlement.user" value = "DocUsers" />
<entry key = "entitlement.contributor" value = "DocContributors" />
</map>
</property>
</bean>
</property>
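The factory class itself could look roughly like the sketch below. It is written in plain Java so it runs without Spring: `FieldMapping` is a stand-in for the class from the question (assumed to have elementName and mapping setters, as the XML suggests), and in a real application `FieldMappingListFactoryBean` would additionally declare that it implements Spring's FactoryBean for a list of FieldMapping, whose methods are the three shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the FieldMapping class from the question (properties assumed from the XML).
class FieldMapping {
    private String elementName;
    private String mapping;

    public String getElementName() { return elementName; }
    public void setElementName(String elementName) { this.elementName = elementName; }
    public String getMapping() { return mapping; }
    public void setMapping(String mapping) { this.mapping = mapping; }
}

// In a Spring application this class would also implement
// org.springframework.beans.factory.FactoryBean<List<FieldMapping>>.
class FieldMappingListFactoryBean {
    private Map<String, String> mappings = new LinkedHashMap<>();

    // Receives the short form: element name -> mapping.
    public void setMappings(Map<String, String> mappings) {
        this.mappings = new LinkedHashMap<>(mappings); // copy, preserving entry order
    }

    // Expands every map entry into a fully configured FieldMapping.
    public List<FieldMapping> getObject() {
        List<FieldMapping> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : mappings.entrySet()) {
            FieldMapping fieldMapping = new FieldMapping();
            fieldMapping.setElementName(entry.getKey());
            fieldMapping.setMapping(entry.getValue());
            result.add(fieldMapping);
        }
        return result;
    }

    public Class<?> getObjectType() {
        return List.class;
    }

    public boolean isSingleton() {
        return true;
    }
}
```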
However, if your field mappings are some kind of reusable DSL, you may try to think about implementing a namespace extension.
I found a way to simulate an effect similar to a "import com.Foo;" in java code. The best option I could find was to use a PropertyPlaceholderConfigurer with local properties defined. Using your example, here's the configuration that you would put at the top of your spring config file to define a "class_FieldMapping" property:
<bean
class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<description>Define properties equivalent to "import foo;" in java source</description>
<property name="properties">
<props>
<prop key="class_FieldMapping">com.company.product.longNamingStandard.migration.extractor.FieldMapping</prop>
</props>
</property>
</bean>
Then, you can use that property within your beans:
<property name="mappingData">
<list>
<bean class="${class_FieldMapping}">
...
</bean>
<bean class="${class_FieldMapping}">
...
</bean>
</list>
</property>
This has the benefit that you can also use it where you actually need the class name and can't reference an instance of an object:
<util:constant static-field="${class_FieldMapping}.MYSTATICVAR" />
Why not declare those inner beans as separate top-level beans with their own names, and then reference them in the list?
If I use PropertyPlaceholderConfigurer, it leads to several exceptions in the debug log. It works, but apparently not on the first try.
