Read only selected columns from a CSV file using Spring Batch

I have a CSV file with 100 columns, but only 3-5 of them need to be loaded into the database.
I don't want to specify all 100 columns in the LineTokenizer in the job XML.
Please suggest how to proceed in this case.

Try using a custom FieldSetMapper. You can read from the FieldSet by index, much like a ResultSet (note that FieldSet indexes start at 0).
You only have to list all the column names if you want automatic mapping.
Specify only the delimiter, in your case ",":
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="YOURFILE" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="fieldSetMapper">
<bean class="CUSTOMFIELDSETMAPPER" />
</property>
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="," />
</bean>
</property>
</bean>
</property>
</bean>
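The custom mapper could look something like this if, say, you want to read the 1st and the 25th column:
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class CustomMapper implements FieldSetMapper<CustomPOJO> {
    @Override
    public CustomPOJO mapFieldSet(FieldSet fieldSet) throws BindException {
        CustomPOJO result = new CustomPOJO();
        // Indexes are zero-based: 0 is the 1st column, 24 is the 25th
        result.setName(fieldSet.readString(0));
        result.setAddress(fieldSet.readString(24));
        return result;
    }
}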
For further explanation on how to use the reader, please refer to this tutorial
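The index-based idea can also be sketched without any Spring classes. The following is only an illustration: `ColumnPicker` and `pick` are hypothetical names, and the plain `split` assumes that no field contains a quoted delimiter. In a real job the DelimitedLineTokenizer does the tokenizing and the mapper only reads the indexes it needs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring Batch: keeps only the wanted
// zero-based column indexes from one delimited line.
class ColumnPicker {

    static List<String> pick(String line, String delimiter, int... indexes) {
        // Limit -1 keeps trailing empty columns. Quoted delimiters are NOT handled here.
        String[] tokens = line.split(Pattern.quote(delimiter), -1);
        List<String> selected = new ArrayList<>();
        for (int index : indexes) {
            if (index < 0 || index >= tokens.length) {
                throw new IllegalArgumentException(
                        "Line has " + tokens.length + " columns, no index " + index);
            }
            selected.add(tokens[index].trim());
        }
        return selected;
    }
}
```

Calling `ColumnPicker.pick(line, ",", 0, 24)` would select the same two columns as `readString(0)` and `readString(24)` in the mapper above.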

Related

How can I turn contents of CSV file into XML using Spring batch?

I'm using Spring Batch 3.0.8.RELEASE. I want to read the contents of a CSV file and turn it into XML. I am familiar with reading CSV files in Spring Batch, but the behavior I've seen is "chunk"-oriented processing, one line at a time, and I'm not sure this default behavior will work for me here.
Here is the CSV sample:
,WT4RT,AIG-00,694304F,9/1/2017,9/30/2017,"6,975.00",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694317E,9/1/2017,9/30/2017,"2,583.80",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694304G,9/1/2017,9/30/2017,"17,600.00",AIG-00201709,10/10/2017,USD,MC
,WT4RT,AIG-00,694304G,9/1/2017,9/30/2017,740,AIG-00201709,10/10/2017,USD,MC
I need to turn this data into the following XML format:
<?xml version="1.0" encoding="UTF-8"?>
<BillingAndCosting version="1.0">
<ControlArea>
<SenderId>CMS-BILLING-100</SenderId>
<WaterMark>92030293829329392030232</WaterMark>
<RecordCount>2</RecordCount>
<TimeStamp>2001-12-17T09:30:47-05:00</TimeStamp>
<DataFileName>String</DataFileName>
</ControlArea>
<DataArea>
<CustomerAccount>
<ExternalKey>1001</ExternalKey>
<ExternalSource>HBVenture</ExternalSource>
<BillingData>
<ReferenceID>0</ReferenceID>
<BillingInvoiceNumber>2000016</BillingInvoiceNumber>
<BillingInvoiceDate>2017-01-31T06:42:07.000Z</BillingInvoiceDate>
<BillingPeriodFromDate>2017-01-01T06:42:07.000Z</BillingPeriodFromDate>
<BillingPeriodThruDate>2017-01-31T06:42:07.000Z</BillingPeriodThruDate>
<BillingInvoiceType>NEW</BillingInvoiceType>
<BillingAmount CurrencyID="USD">1290.39</BillingAmount>
<InvoiceItem>
<CategoryCode>res-group</CategoryCode>
<TaxCategoryID>C2</TaxCategoryID>
<InvoiceItemAmount CurrencyID="USD">1290.39</InvoiceItemAmount>
<ProductID>694601F</ProductID>
<ISVUID>1</ISVUID>
</InvoiceItem>
</BillingData>
<BillingData>
<ReferenceID>0</ReferenceID>
<BillingInvoiceNumber>2000017</BillingInvoiceNumber>
<BillingInvoiceDate>2017-01-31T06:42:07.000Z</BillingInvoiceDate>
<BillingPeriodFromDate>2017-01-01T06:42:07.000Z</BillingPeriodFromDate>
<BillingPeriodThruDate>2017-01-31T06:42:07.000Z</BillingPeriodThruDate>
<BillingInvoiceType>NEW</BillingInvoiceType>
<BillingAmount CurrencyID="USD">590.39</BillingAmount>
<InvoiceItem>
<CategoryCode>gateway_resource_group</CategoryCode>
<TaxCategoryID>C2</TaxCategoryID>
<InvoiceItemAmount CurrencyID="USD">590.39</InvoiceItemAmount>
<ProductID>694601F</ProductID>
<ISVUID>1</ISVUID>
</InvoiceItem>
</BillingData>
I'm showing only a portion of the XML for brevity. The thing I don't know is: how to use Spring Batch to read the entire CSV file to populate an Object that I can then send to the XML Marshaller for converting into XML.
Can it be done with Spring batch, or do I have to 'roll my own'?

We use StaxEventItemWriter as the ItemWriter (example below):
<bean id = "xmlItemWriter"
class = "org.springframework.batch.item.xml.StaxEventItemWriter">
<property name = "resource" value = "file:my_path_to_xml.xml" />
<property name = "marshaller" ref = "reportMarshaller" />
<property name = "rootTagName" value = "BillingAndCosting" />
</bean>
<bean id = "reportMarshaller"
class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name = "classesToBeBound">
<list>
<value>BillingAndCosting</value>
</list>
</property>
</bean>

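You can use FlatFileItemReader to read the CSV file. Follow the sample code below.
The ItemReader reads the input CSV file one complete line at a time:
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<!-- the resource property is not shown in the original snippet -->
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">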
<property name="fieldSetMapper">
<!-- Mapper which maps each individual items in a record to properties in POJO -->
<bean class="com.websystique.springbatch.ExamResultFieldSetMapper" />
</property>
<property name="lineTokenizer">
<!-- A tokenizer class to be used when items in input record are separated by specific characters -->
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
</bean>
</property>
</bean>
The XML ItemWriter, which writes the data in XML format:
<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="file:xml/examResult.xml" />
<property name="rootTagName" value="UniversityExamResultList" />
<property name="marshaller">
<bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>com.websystique.springbatch.model.ExamResult</value>
</list>
</property>
</bean>
</property>
</bean>
You should also go through the official Spring docs.
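One detail of the sample rows in the question is that the amounts are quoted and contain commas ("6,975.00"), so the tokenizer has to respect quotes and the amount has to be cleaned before it becomes a number. The following plain-Java sketch shows what that involves; `QuotedCsvLine`, `split` and `parseAmount` are hypothetical names and this is not the Spring Batch implementation.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one CSV line on commas while respecting double quotes.
class QuotedCsvLine {

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                // A doubled quote inside a quoted field is an escaped quote.
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                // An unquoted comma ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Turns an amount such as "6,975.00" into a BigDecimal by dropping the grouping commas.
    static BigDecimal parseAmount(String raw) {
        return new BigDecimal(raw.replace(",", "").trim());
    }
}
```

With the first sample row, the amount is the field at index 6, and `parseAmount` accepts both the quoted form and the plain `740` of the last row.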

Spring Batch - Last item from the reader alone is getting updated

I have to read from a file (FlatFile) and update a column if the ID present in the file matches the ID in the column. The file is being read properly, but only the last ID value is getting updated. Please find the snippet below.
Job-Config.xml
<bean id="abcitemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="file:datafile/outputs/ibdData.txt" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="names" value="ID,NAM,TYPE" />
<property name="delimiter" value="|"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="com.pershing.intraware.springbatch.mapper.abcFieldsetMapper" />
</property>
</bean>
</property>
</bean>
<bean id="abcitemWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter" scope="step">
<property name="dataSource" ref="dataSource" />
<property name="sql"><value>UPDATE TEST_abc SET BIZ_ARNG_CD = CASE WHEN ID IN (SELECT ID FROM TEST_abc WHERE ID= ? and MONTH=(to_char(sysdate, 'MM')) AND YR =(to_char(sysdate, 'YY'))) THEN 'Y' ELSE 'N' END</value></property>
<!-- It will take care matching between object property and sql name parameter -->
<property name="itemPreparedStatementSetter" ref="testPrepStatementSetter" />
</bean>
</beans>
Setter.java
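import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.springframework.batch.item.database.ItemPreparedStatementSetter;

public class IDItemPreparedStatementSetter implements ItemPreparedStatementSetter<Test> {
    @Override
    public void setValues(Test item, PreparedStatement ps) throws SQLException {
        // Bind the item's ID to the single ? placeholder of the UPDATE statement
        ps.setString(1, item.getID());
    }
}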

Your query updates every row in the table each time it is executed, because the UPDATE has no WHERE clause. You need to restrict that. As written, once the last item has been processed, only the records whose ID equals the ID of the last record passed to the writer are left with BIZ_ARNG_CD set to 'Y'; all other rows have been reset to 'N'.
You can fix this in two ways:
1. Default the database column to 'N' and don't set it to 'N' in the update statement.
2. Add a WHERE clause to the update statement (BIZ_ARNG_CD != 'Y').
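The difference between an update that rewrites every row per item and one that only touches the matching row can be simulated in memory. This is a simplified sketch, not the actual writer: `UpdateSimulation` and its methods are hypothetical, the table is a plain map from ID to flag, and the month/year condition of the original statement is ignored.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the table: key = ID, value = BIZ_ARNG_CD flag.
class UpdateSimulation {

    // Mimics the original statement: no WHERE clause, so every row is rewritten per item.
    static Map<String, String> unrestricted(List<String> tableIds, List<String> fileIds) {
        Map<String, String> table = newTable(tableIds);
        for (String fileId : fileIds) {
            for (Map.Entry<String, String> row : table.entrySet()) {
                row.setValue(row.getKey().equals(fileId) ? "Y" : "N");
            }
        }
        return table;
    }

    // Mimics the fix: rows default to 'N' and only the matching row is set to 'Y'.
    static Map<String, String> restricted(List<String> tableIds, List<String> fileIds) {
        Map<String, String> table = newTable(tableIds);
        for (String fileId : fileIds) {
            if (table.containsKey(fileId)) {
                table.put(fileId, "Y");
            }
        }
        return table;
    }

    // Every row starts with the default flag 'N'.
    private static Map<String, String> newTable(List<String> ids) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String id : ids) {
            table.put(id, "N");
        }
        return table;
    }
}
```

With table IDs 1, 2, 3 and file IDs 1, 2, the unrestricted variant leaves only ID 2 flagged, while the restricted variant flags both 1 and 2.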

Reading in UTF-8 encoding in Java results in issues at only some places

I am reading a UTF-8 encoded file byte by byte in Java Spring Batch and saving the content in MongoDB. The reading is delegated to FlatFileItemReader and the encoding is injected into the bean as UTF-8.
What I do see is that only at one or two specific places, the umlaut character is getting stored with an encoding problem in mongodb. I verified that the content in the original file is okay. What is interesting is that everywhere else, the umlaut character in the original file is interpreted correctly and stored correctly in mongodb.
Could you help with this?
<bean id="recordReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="resource" value="file:#{jobExecutionContext['inputfile']}" />
<property name="linesToSkip" value = "1" />
<property name= "encoding" value ="UTF-8"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper" >
<property name = "lineTokenizer">
<bean class = "com.xyz.NoQuoteDelimitedLineTokenizer">
<property name = "strict" value = "false" />
<property name = "names" value = "Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8" />
<property name="delimiter">
<util:constant static-field="org.springframework.batch.item.file.transform.DelimitedLineTokenizer.DELIMITER_TAB" />
</property>
</bean>
</property>
<property name = "fieldSetMapper">
<bean class = "com.xyz.MyFieldSetMapper" >
<property name = "ctx" value="#{jobExecutionContext['contextBean']}"/>
</bean>
</property>
</bean>
</property>
</bean>
FieldSetMapper
private void validateFields(String id, ConsumerDO ci, FieldSet fieldset_p) throws Exception {
    try {
        ci.setCaption(fieldset_p.readString("newCaption"));
    } catch (Exception e) {
        // the exception is silently ignored in the original snippet
    }
}

How to get all the records from the database using HQL with Spring and Hibernate

Here I tried to write code to get a list from a MySQL database using Spring and Hibernate.
The problem is how to initialize org.hibernate.Session se through the "applicationContext.xml" file as a bean.
public void getList(Session se){
String liststudent="from StudentList stud";
Query q=se.createQuery(liststudent);
List<Object> list=q.list();
for(Object obj:list){
Object studarr[]=(Object[])obj;
System.out.println("Data at Zero Index"+studarr[0]);
}
}
Here the property template is initialized through the ref template.
Is there any way to initialize Session se in the same way?
<bean name="mydao" class="dao.MyDao">
<property name="template" ref="template"></property>
</bean>

In your context.xml file, define the sessionFactory:
<bean id="sessionFactory" class="LocalSessionFactoryBean">
<property name="dataSource" ref="dataSource"/>
<!-- define other properties...mapping files -->
</bean>
Define the hibernateTemplate:
<bean id="template" class="*HibernateTemplate">
<property name="sessionFactory" ref="sessionFactory" />
</bean>
Then:
<bean name="mydao" class="dao.MyDao">
<property name="template" ref="template"></property>
</bean>
NOTE: the class packages have been omitted.

How can I get the number of line and the filename in itemProcessor - spring batch

I am using Spring Batch to parse my files. In the ItemProcessor I validate whether the incoming fields are correct. If they are not, I want to throw a ValidationException and log the corresponding row with the incorrect fields to a file. So, how can I find the line number and the filename in the ItemProcessor?

Without seeing your ItemReader config I can't really be sure, but if you are using something like FlatFileItemReader to parse a CSV, then in strict mode it will validate the number of columns.
Assuming your reader looks like this:
<bean id="iItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<property name="linesToSkip" value="1"/>
<property name="comments" value="#" />
<property name="encoding" value="UTF-8"/>
<property name="lineMapper" >
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names">
<list >
<value>First_Field</value>
<value>Second_Field</value>
</list>
</property>
<property name="strict" value="true"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="uk.co.package.FieldSetMapper">
<property name="dateFormat" value="yyyy-MM-dd HH:mm:ss"/>
</bean>
</property>
</bean>
</property>
</bean>
It will throw a FlatFileParseException for any lines that can't be processed. This includes the line number and can be handled in a listener.

As for the line number, you might build your own LineMapper and then store the line-number in your business object. An example in which I store the line unprocessed (as-is) together with the line number:
DefaultLineMapper<OneRow> lineMapper = new DefaultLineMapper<OneRow>() {
    @Override
    public OneRow mapLine(String line, int lineNumber) throws Exception {
        // Keep the raw line together with its line number
        return new OneRow(lineNumber, line);
    }
};
Of course you can already map your object at this point; I needed the whole unprocessed line as input to my processors.
As a reference with the same idea: https://stackoverflow.com/a/23770421/5658642
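The same idea, carrying the line number and the file name inside the item so that a later validation step can log them, can be sketched in plain Java. This is an illustration outside Spring Batch: `NumberedLine` and `NumberedLineReader` are hypothetical names, and the file name is simply passed in by the caller.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical item type: the raw line plus where it came from.
class NumberedLine {
    final String fileName;
    final int lineNumber;
    final String text;

    NumberedLine(String fileName, int lineNumber, String text) {
        this.fileName = fileName;
        this.lineNumber = lineNumber;
        this.text = text;
    }

    // Message suitable for a validation log entry.
    String describe() {
        return fileName + ":" + lineNumber + ": " + text;
    }
}

class NumberedLineReader {

    // Reads all lines and tags each one with the file name and its 1-based line number.
    static List<NumberedLine> readAll(String fileName, Reader source) {
        List<NumberedLine> lines = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String text;
            while ((text = reader.readLine()) != null) {
                // getLineNumber() has already advanced past the line just read.
                lines.add(new NumberedLine(fileName, reader.getLineNumber(), text));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

A processor that rejects an item can then include `describe()` in the exception message or log entry, so the offending row is identified by file and line.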
