Spring Batch - Issue with PageSize in JdbcPagingItemReader - spring

Hi We are working on a spring batch, which processes all the SKUs in SKU table and send a request to inventory system to get the inventory details. To send to invetory details we need to send 100 SKI ids at a time so we have set the pageSize as 100.
in the reader log:
we see
SELECT * FROM (SELECT S_ID ,S_PRNT_PRD,SQ, ROWNUM as TMP_ROW_NUM FROM
XXX_SKU WHERE SQ>=:min and SQ <=:max ORDER BY SQ ASC) WHERE ROWNUM <=
100]
But we observe in the WRITER that is for certain time 100 SKU are sent and for certain requests only 1 SKU is sent.
public void write(List<? extends XXXPagingBean> pItems) throws XXXSkipItemException {
if (mLogger.isLoggingDebug()) {
mLogger.logDebug("XXXInventoryServiceWriter.write() method STARTING, ItemsList size:{0}" +pItems.size());
}
....
....
}
pageSize and commitInterval is set to 100 (are these suppose to be same?)
sortKey (SEQ_ID) should be same a column use in partitiner?
Bean configurations in XML:
<!-- InventoryService Writer configuration -->
<bean id="inventoryGridService" class="atg.nucleus.spring.NucleusResolverUtil" factory-method="resolveName">
<constructor-arg value="/com/XXX/gigaspaces/inventorygrid/service/InventoryGridService" />
</bean>
<bean id="inventoryWriter" class="com.XXX.batch.integrations.XXXXXX.XXXXInventoryServiceWriter" scope="step">
<property name="jdbcTemplate" ref="batchDsTemplate"></property>
<property name="inventoryGridService" ref="inventoryGridService" />
</bean>
<bean id="pagingReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" xmlns="http://www.springframework.org/schema/beans" scope="step">
<property name="dataSource" ref="dataSource" />
<property name="queryProvider">
<bean id=" productQueryProvider" class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="selectClause" value="select S_ID ,S_PRNT_PRD" />
<property name="fromClause" value="from XXX_SKU" />
<property name="sortKey" value="SEQ_ID" />
<property name="whereClause" value="SEQ_ID>=:min and SEQ_ID <=:max"></property>
</bean>
</property>
<property name="parameterValues">
<map>
<entry key="min" value="#{stepExecutionContext[minValue]}"></entry>
<entry key="max" value="#{stepExecutionContext[maxValue]}"></entry>
</map>
</property>
<property name="pageSize" value="100" />
<property name="rowMapper">
<bean class="com.XXX.batch.integrations.endeca.XXXPagingRowMapper"></bean>
</property>
</bean>
Please suggest.

Remove your whereClause from the productQueryProvider bean definition and get rid of your parameterValues and it should work. The PagingQueryProvider takes care of paging automatically for you. There's no need to do that manually yourself.

Related

Reg Spring Batch Transaction

My Requirement is I need to have two datasource connected to Spring Batch Application.
1) One for Spring Batch Jobs and Executions storing
2) One for Business Data Stroing, Processing and Retreiving.
I know that there are lot of solutions for achieving this. But I have achieved by setting the second datasource as primary. The problem is the second datasource is not coming under transaction scope instead it is committing for each sql statement executing expecially through jdbctemplate.
As I can't able to edit my question. I am writing another Post in detail
My Requirement is I need to have two datasource connected to Spring Batch Application.
1) One for Spring Batch Jobs and Executions storing
2) One for Business Data Stroing, Processing and Retreiving.
In env-context.xml I have following configuration
<!-- Enable annotations-->
<context:annotation-config/>
<bean primary="true" id="dataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiName" value="java:/DB2XADS"/>
</bean>
<!-- Creating TransactionManager Bean, since JDBC we are creating of type
DataSourceTransactionManager -->
<bean id="transactionManager" primary="true"
class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
<property name="dataSource" ref="dataSource" />
</bean>
<!-- jdbcTemplate uses dataSource -->
<bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
<property name="dataSource" ref="dataSource" />
</bean>
<bean id="batchTransactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="transactionTemplate"
class="org.springframework.transaction.support.TransactionTemplate">
<property name="transactionManager" ref="transactionManager" />
</bean>
In override-context.xml I have the following code
<tx:annotation-driven transaction-manager="transactionManager" />
<!-- jdbcTemplate uses dataSource -->
<bean id="batchDataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiName" value="java:/MySqlDS"/>
</bean>
<bean class="com.honda.pddabulk.utility.MyBatchConfigurer">
<property name="dataSource" ref="batchDataSource" />
</bean>
<!-- Use this to set additional properties on beans at run time -->
<bean id="placeholderProperties"
class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="locations">
<list>
<value>classpath:/org/springframework/batch/admin/bootstrap/batch.properties
</value>
<value>classpath:/batch/batch-mysql.properties</value>
<value>classpath:log4j.properties</value>
</list>
</property>
<property name="systemPropertiesModeName" value="SYSTEM_PROPERTIES_MODE_OVERRIDE"/>
<property name="ignoreResourceNotFound" value="true"/>
<property name="ignoreUnresolvablePlaceholders" value="true"/>
<property name="order" value="1"/>
</bean>
<!-- Overrider job repository -->
<bean id="jobRepository"
class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
<property name="databaseType" value="mysql"/>
<property name="dataSource" ref="batchDataSource"/>
<property name="tablePrefix" value="${batch.table.prefix}"/>
<property name="maxVarCharLength" value="2000"/>
<property name="isolationLevelForCreate" value="ISOLATION_SERIALIZABLE"/>
<property name="transactionManager" ref="batchTransactionManager"/>
</bean>
<!-- Override job service -->
<bean id="jobService" class="org.springframework.batch.admin.service.SimpleJobServiceFactoryBean">
<property name="tablePrefix" value="${batch.table.prefix}"/>
<property name="jobRepository" ref="jobRepository"/>
<property name="jobLauncher" ref="jobLauncher"/>
<property name="jobLocator" ref="jobRegistry"/>
<property name="dataSource" ref="batchDataSource"/>
</bean>
<!-- Override job launcher -->
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
<property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>
<task:executor id="jobLauncherTaskExecutor" pool-size="21" rejection-policy="ABORT" />
<!-- Override job explorer -->
<bean id="jobExplorer"
class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean">
<property name="tablePrefix" value="${batch.table.prefix}"/>
<property name="dataSource" ref="batchDataSource"/>
</bean>
In job-config.xml I have the following code
<context:component-scan base-package="com.honda.*">
<context:exclude-filter type="regex"
expression="com.honda.pddabulk.utility.MyBatch*" />
</context:component-scan>
I have the custom Batch configurer set. Now the problem is when I try to execute queries with jdbctemplate for update and insert it is not under transaction which means #Transactional is not working.
Rather commit is happening for each method call. The example is
#Transactional
public void checkInsertion() throws Exception{
try{
jdbcTemplate.update("INSERT INTO TABLE_NAME(COLUMN1, COLUMN2) VALUES( 'A','AF' );
throw new PddaException("custom error");
}catch(Exception ex){
int count=jdbcTemplate.update("ROLLBACK");
log.info("DATA HAS BEEN ROLLBACKED SUCCESSFULLY... "+count);
throw ex;
}
}
In the above code I am trying to insert data and immediately I am also throwing a exception which means the insert should happen but commit will not. So we will not be able to see any data but unfortunately the commit is happening. Please some one help

how to use limit and offset clause in JdbcPagingItemReader in spring batch?

The table has more than 200 million records, but i need to restrict the select top 5 million records. I have tried with jdbcCursorItemReader which is taking around 2-3 hrs to select and write it to the csv file using single step by chunk processing, So i choose to go with parallel processing that spring is batch offering.
i,e by having taskExecutor and JdbcPagingItemReader making each 5 individual files of million each but the problem is i am not able to specify the limit and offset clause in query parameters. please help me on this. Approach better than this too is appreciated.
<bean id="itemReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" scope="step">
<property name="dataSource" ref="dataSource" />
<property name="rowMapper">
<bean class="MyRowMapper" />
</property>
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="sortKeys">
<map>
<entry key="esmeaddr" value="ASCENDING"/>
</map>
</property>
<property name="selectClause" value="elect cust_send,dest,msg,stime,dtime,dn_status,mid,rp,operator,circle,cust_mid,first_attempt,second_attempt,third_attempt,fourth_attempt,fifth_attempt,term_operator,term_circle,bindata,reason,tag1,tag2,tag3,tag4,tag5"
/>
<property name="fromClause" value="FROM bill_log " />
<property name="whereClause" value="where esmeaddr = '70897600000000' and country='India' and apptype='SMS' Limit 0,1000000" />
</bean>
</property>
<property name="pageSize" value="1000000" />
<property name="parameterValues">
<map>
<entry key="param1" value="#{jobExecutionContext[param1]}" />
<entry key="param2" value="#{jobExecutionContext[param2]}" />
</map>
</property>
</bean>
You can't use a SQL LIMIT clause within that reader since that's what the reader itself will do. Instead, Spring Batch has the functionality built into the JdbcPagingItemReader. To set the max number of items to process, you can configure the reader with JdbcPagingItemReader#setMaxItemCount(5000000) and if there is an offset, you would set the JdbcPagingItemReader#setCurrentItemCount(offset). That being said, the offset will be overriden on a restart with any value it finds in the ExecutionContext. You can read more about this in the javadoc here: https://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/database/JdbcPagingItemReader.html

How to use ORALCE SQL SEQUENCE in spring batch writer.?

can someone please let me know how to use oracle sequence in spring batch writer?.
I tried using custseq.nextval in the insert statement, but its failing.
<bean id="testSSRReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource" ref="bconnectedDataSource" />
<property name="sql"
value="select CUST_USA_ID , CUST_FIRST_NAME , CUST_LAST_NAME from BL_CUSTOMER fetch first 100 rows only" />
<property name="rowMapper">
<bean class="com.macys.batch.rowmapper.TestSSRRowMapper" />
</property>
</bean>
<bean id="testSSRProcessor" class="com.macys.batch.processor.TestSSRProcessor" />
<bean id="testSSRWriter"
class="org.springframework.batch.item.database.JdbcBatchItemWriter">
<property name="dataSource" ref="ocDataSource" />
<property name="sql">
<value>
<![CDATA[
insert into TESTTABLESSR(CUSTOMER_ID,CUSTOMER_NAME,CITY)
VALUES (custseq.nextval,:firstName,:lastName)
]]>
</value>
</property>
<property name="itemSqlParameterSourceProvider" ref="itemSqlParameterSourceProvider" />
</bean>

JDBCBatchItemWriter not receiving the List for batch update

I'm a new bee to spring batch. My requirement is to fetch records from a DB table, process them(each record can be processed independently so i'm partitioning and using a task executor) and then update the status column in the same table based on processing status.
Simplified version of my code is below.
Item Reader (My custom column partitioner will decide the min & max value below):
<bean name="databaseReader" class="org.springframework.batch.item.database.JdbcCursorItemReader" scope="step">
<property name="dataSource" ref="dataSource"/>
<property name="sql">
<value>
<![CDATA[
select id,user_login,user_pass,age from users where id >= #{stepExecutionContext['minValue']} and id <= #{stepExecutionContext['maxValue']}
]]>
</value>
</property>
<property name="rowMapper">
<bean class="com.springapp.batch.UserRowMapper" />
</property>
<property name="verifyCursorPosition" value="false"/>
</bean>
Item Processor:
<bean id="itemProcessor" class="com.springapp.batch.UserItemProcessor" scope="step"/>
....
public class UserItemProcessor implements ItemProcessor<Users, Users>
{
#Override
public Users process(Users users) throws Exception {
// do some processing here..
//update users status
//users.setStatus(users.getId() + ": Processed by :" + Thread.currentThread().getName() + ": Processed at :" + new GregorianCalendar().getTime().toString());
//System.out.println("Processing user :" + users + " :" +Thread.currentThread().getName());
return users;
}
}
Item Writer:
<bean id="databaseWriter" class="org.springframework.batch.item.database.JdbcBatchItemWriter">
<property name="dataSource" ref="dataSource" />
<property name="sql">
<value>
<![CDATA[
update users set status = :status where id= :id
]]>
</value>
</property>
<property name="itemSqlParameterSourceProvider">
<bean class="org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider" />
</property>
</bean>
Step configuration:
<batch:job id="usersJob">
<batch:step id="stepOne">
<batch:partition step="worker" partitioner="myColumnRangepartitioner" handler="partitionHandler" />
</batch:step>
</batch:job>
<batch:step id="worker" >
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="databaseReader" writer="databaseWriter" commit-interval="5" processor="itemProcessor" />
</batch:tasklet>
</batch:step>
<bean id="asyncTaskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
<bean id="partitionHandler" class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler" scope="step">
<property name="taskExecutor" ref="asyncTaskExecutor"/>
<property name="step" ref="worker" />
<property name="gridSize" value="3" />
</bean>
Since i have specified the commit interval as 5 my understanding is that when 5 items are processed by a partition, it will call JDBCItemWriter with a List of 5 Users object to perform a batch JDBC update. However with the current setup, i'm receiving 1 User object at a time during batch update.
Is my understanding above correct or am i missing any step/configuration ?
Note: I'm using HSQL file based database for testing.
<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
<property name="driverClassName" value="org.hsqldb.jdbc.JDBCDriver"/>
<property name="url" value="jdbc:hsqldb:file:C://users.txt"/>
<property name="username" value="sa"/>
<property name="password" value=""/>
</bean>

Spring Batch: Reading a File : if field is empty setting the default value

I am very new to spring batch. I have requirement in which i have to read a file having a header(Field Names) record and data records
i have to validate 1st record (check the field names matching against set of predefined names)- note that this record need to be skipped- i mean should not be part of items in processor)
read and store rest of the field values to a POJO
if the field 'date' is empty , i need to set the default value as 'xxxx-yy-zz'
i am unable to 1st and 3rd requirement with batch
here is the sample reader XML. please help
<bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="classpath:input/import" />
<property name="encoding" value="UTF-8" />
<property name="linesToSkip" value="1" />
<property name="lineMapper" ref="line.mapper"/>
</bean>
<bean id="line.mapper" class="org.springframework.batch.item.file.mapping .DefaultLineMapper">
<property name="lineTokenizer" ref="line.tokenizer"/>
<property name="fieldSetMapper" ref="fieldSet.enity.mapper"/>
</bean>
<bean id="line.tokenizer" class="org.springframework.batch.item.file.transfo rm.DelimitedLineTokenizer">
<property name="delimiter">
<util:constant static-field="org.springframework.batch.item.file.transfo rm.DelimitedLineTokenizer.DELIMITER_TAB"/>
</property>
<property name="names" value="id,date,age " />
<property name="strict" value="false"/>
</bean>
<bean id="fieldSet.enity.mapper" class="org.springframework.batch.item.file.mapping .BeanWrapperFieldSetMapper">
<property name="targetType" value="a.b.myPOJO"/>
<property name="customEditors">
<map>
<entry key="java.util.Date">
<bean class="org.springframework.beans.propertyeditors.C ustomDateEditor">
<constructor-arg>
<bean class="java.text.SimpleDateFormat">
<constructor-arg value="yyyy-mm-dd" />
</bean>
</constructor-arg>
<constructor-arg value="true" />
</bean>
</entry>
</map>
</property>
Create your own custom FieldSetMapper like below
CustomeFieldSetMapper implements FieldSetMapper<a.b.myPOJO> {
#Override
public a.b.myPOJO mapFieldSet(FieldSet fs) {
a.b.myPOJO myPOJO = new a.b.myPOJO();
if(fs.readString("date").isEmpty()){
myPOJO.setDate("xxxx-yy-zz");
}
return a.b.myPOJO;
}
}
You think you should do date set in ItemProcessor.
Also, if <property name="linesToSkip" value="1" /> not fill your requirements - extend FlatFileItemReader and validate first line manually in it.

Resources