spring batch - how to execute purge tasklet - spring

I have written a simple spring batch tasklet which calls a dao method which in turn does some deletes. But I am not sure what I should be doing to call the job.
public class RemoveSpringBatchHistoryTasklet implements Tasklet {
#Autowired
private SpringBatchDao springBatchDao;
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
throws Exception {
contribution.incrementWriteCount(springBatchDao.purge());
return RepeatStatus.FINISHED;
}
}
So far to execute my spring batch jobs I am using quartz triggers with a setup like so. Each job has it's own xml file which has a read and a writer.
<bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="jobDetails">
<list>
<ref bean="dailyTranCountJobDetail" />
</list>
</property>
<property name="triggers">
<list>
<ref bean="dailyTranCountCronTrigger" />
</list>
</property>
</bean>
<bean id="dailyTranCountCronTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail" ref="dailyTranCountJobDetail" />
<property name="cronExpression" value="#{batchProps['cron.dailyTranCounts']}" />
</bean>
<bean id="dailyTranCountJobDetail" class="org.springframework.scheduling.quartz.JobDetailBean">
<property name="jobClass" value="com.myer.reporting.batch.JobLauncherDetails" />
<property name="group" value="quartz-batch" />
<property name="jobDataAsMap">
<map>
<entry key="jobName" value="job-daily-tran-counts" />
<entry key="jobLocator" value-ref="jobRegistry" />
<entry key="jobLauncher" value-ref="jobLauncher" />
</map>
</property>
</bean>
And then here is an example of the job file itself with a reader and a writer.
<job id="job-daily-tran-counts" xmlns="http://www.springframework.org/schema/batch">
<step id="job-daily-tran-counts-step1">
<tasklet transaction-manager="custDbTransactionManager">
<chunk
reader="dailyTranCountJdbcCursorItemReader"
writer="dailyTranCountItemWriter"
commit-interval="1000" />
</tasklet>
</step>
</job>
<bean id="dailyTranCountJdbcCursorItemReader"
class="com.myer.reporting.dao.itemreader.DailyTranCountJdbcCursorItemReader"
scope="step"
parent="abstractEposJdbcDao">
<property name="rowMapper">
<bean class="com.myer.reporting.dao.mapper.DailyTranCountMapper" />
</property>
</bean>
<bean id="dailyTranCountItemWriter"
class="com.myer.reporting.dao.itemwriter.DailyTranCountItemWriter"
parent="abstractCustDbJdbcDao"/>
Obviously for this new job there is no reader or writer. So what it he best/correct way for me to execute my new tasklet?
thanks

I prefer java configuration instead of xml. You can configure your tasklet with the following code:
#Configuration
#EnableBatchProcessing
public class BatchCleanUpJobsConfiguration {
#Bean
public Job batchCleanUpJob(final JobBuilderFactory jobBuilderFactory,
final StepBuilderFactory stepBuilderFactory,
final RemoveSpringBatchHistoryTasklet removeSpringBatchHistoryTasklet) {
return jobBuilderFactory.get("batchCleanUpJob")
.start(stepBuilderFactory.get("batchCleanUpStep")
.tasklet(removeSpringBatchHistoryTasklet)
.build())
.build();
}
#Bean
public RemoveSpringBatchHistoryTasklet batchCleanUpTasklet(final JdbcTemplate jdbcTemplate) {
final var tasklet = new RemoveSpringBatchHistoryTasklet();
tasklet.setJdbcTemplate(jdbcTemplate);
return tasklet;
}
}
To schedule your new job use the following code:
#Component
#RequiredArgsConstructor
public class BatchCleanUpJobsScheduler {
private final Job batchCleanUpJob;
private final JobLauncher launcher;
#Scheduled(cron = "0 0 0 * * MON-FRI")
public void launchBatchCleanupJob()
throws JobParametersInvalidException, JobExecutionAlreadyRunningException,
JobRestartException, JobInstanceAlreadyCompleteException {
launcher.run(
batchCleanUpJob,
new JobParametersBuilder()
.addLong("launchTime", System.currentTimeMillis())
.toJobParameters());
}
}

Related

Spring Batch- Xml based configuration performance over Java based

I am trying to convert the spring batch configuration from xml based to annotation based.
Below is my xml based configuration.
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
<!-- Step will need a transaction manager -->
<bean id="transactionManager"
class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<bean id="dbMapper" class="org.test.DBValueMapper">
</bean>
<bean id="dbMapperFlatfile" class="org.test.FlatFileRowMapper">
</bean>
<bean id="paramSetter" class="org.test.DBParamSetter">
</bean>
<bean id="dbReader" class="org.test.DBValueReader"
scope="step">
<property name="paramSetter" ref="paramSetter"/>
<property name="verifyCursorPosition" value="false" />
<property name="dataSource" ref="dataSource" />
<property name="sql" value="#{jobParameters['SQL_QUERY']}" />
<property name="rowMapper" ref="dbMapper" />
<property name="fetchSize" value="5000" />
</bean>
<bean id="dbWriterIO" class="org.test.TemplateWritterIO"
scope="step">
<property name="velocityEngine" ref="velocityEngine" />
<!-- <property name="rptConfig" value="#{jobParameters['RPT_CONFIGVAL']}" /> -->
<property name="headerCallback" ref="dbWriterIO" />
<property name="footerCallback" ref="dbWriterIO" />
</bean>
<batch:job id="fileGenJobNio">
<batch:step id="fileGenJobStempNio">
<batch:tasklet>
<batch:chunk reader="dbReader" writer="dbWriterNIO"
commit-interval="5000">
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
Below is the equivalent Java based configuration:
#EnableBatchProcessing
#Import({ServiceConfiguration.class})
public class SRBatchGenerator extends DefaultBatchConfigurer{
#Autowired
private JobBuilderFactory jobBuilders;
#Autowired
private StepBuilderFactory stepBuilders;
#Autowired
private VelocityEngine velocityEngine;
#Autowired
private DBValueMapper mapper;
#Autowired
private DbHelper dbhelper;
#Autowired
private DataSource datasource;
#Bean
public Step step(){
return stepBuilders.get("step")
.chunk(5000)
.reader(reader())
//.processor(processor())
.writer(writer())
//.listener(logProcessListener())
.faultTolerant()
//.skipLimit(10)
//.skip(UnknownGenderException.class)
//.listener(logSkipListener())
.build();
}
#Bean
public Job fileGeneratorJob(){
return jobBuilders.get("fileGeneratorJob")
//.listener(protocolListener())
.start(step())
.build();
}
#Bean
public DBValueMapper mapper(){
return new DBValueMapper();
}
#Bean
#StepScope
public DBValueReader3 reader(){
String query="Select Test1,Test2,test3,test4 from RPt_TEST";
DBValueReader3 dbread = new DBValueReader3();
dbread.setSql(query);
dbread.setRowMapper(mapper);
dbread.setDataSource(datasource);
return dbread;
}
#Bean
#StepScope
public TemplateWritterIO writer(){
TemplateWritterIO writer=new TemplateWritterIO();
writer.setVelocityEngine(velocityEngine);
return writer;
}
#Override
protected JobRepository createJobRepository() throws Exception {
MapJobRepositoryFactoryBean factory =
new MapJobRepositoryFactoryBean();
factory.afterPropertiesSet();
return (JobRepository) factory.getObject();
}
}
When I execute my Job using xml based it took 27sec to write 1 Million record into Flat file.
But to write same 1 million record, Java based job took about 2 hours to write.
I am not sure what I am missing here. Can anyone help me or guide me why it is slow in Java based configuration.

Why Spring batch executing as singleton instead of multithread

I am invoking a spring batch job through quartz scheduler, which should run every 1 minute.
When the job runs the first time, the ItemReader is opened successfully and the job runs. However when the job attempts to run a second time, it's using the same instance it did the first time which is already initialized and receiving "java.lang.IllegalStateException: Stream is already initialized. Close before re-opening." I have set scope as step for both itemreader and itemwriter.
Please let me know if I am doing anything wrong in configuration?
<?xml version="1.0" encoding="UTF-8"?>
<import resource="context.xml"/>
<import resource="database.xml"/>
<bean id="MyPartitioner" class="com.MyPartitioner" />
<bean id="itemProcessor" class="com.MyProcessor" scope="step" />
<bean id="itemReader" class="com.MyItemReader" scope="step">
<property name="dataSource" ref="dataSource"/>
<property name="sql" value="query...."/>
<property name="rowMapper">
<bean class="com.MyRowMapper" scope="step"/>
</property>
</bean>
<job id="MyJOB" xmlns="http://www.springframework.org/schema/batch">
<step id="masterStep">
<partition step="slave" partitioner="MyPartitioner">
<handler grid-size="10" task-executor="taskExecutor"/>
</partition>
</step>
</job>
<step id="slave" xmlns="http://www.springframework.org/schema/batch">
<tasklet>
<chunk reader="itemReader" writer="mysqlItemWriter" processor="itemProcessor" commit-interval="100"/>
</tasklet>
</step>
<bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="20"/>
<property name="maxPoolSize" value="20"/>
<property name="allowCoreThreadTimeOut" value="true"/>
</bean>
<bean id="mysqlItemWriter" class="com.MyItemWriter" scope="step">
<property name="dataSource" ref="dataSource"/>
<property name="sql">
<value>
<![CDATA[
query.....
]]>
</value>
</property>
<property name="itemPreparedStatementSetter">
<bean class="com.MyPreparedStatementSetter" scope="step"/>
</property>
</bean>
Quartz job invoker-
Scheduler scheduler = new StdSchedulerFactory("quartz.properties").getScheduler();
JobKey jobKey = new JobKey("QUARTZJOB", "QUARTZJOB");
JobDetail jobDetail = JobBuilder.newJob("com.MySpringJobInvoker").withIdentity(jobKey).build();
jobDetail.getJobDataMap().put("jobName", "SpringBatchJob");
SimpleTrigger smplTrg = newTrigger().withIdentity("QUARTZJOB", "QUARTZJOB").startAt(new Date(startTime))
.withSchedule(simpleSchedule().withIntervalInSeconds(frequency).withRepeatCount(repeatCnt))
.forJob(jobDetail).withPriority(5).build();
scheduler.scheduleJob(jobDetail, smplTrg);
Quartz job -
public class MySpringJobInvoker implements Job
{
#Override
public void execute(JobExecutionContext jobExecutionContext) throws JobExecutionException
{
JobDataMap data = jobExecutionContext.getJobDetail().getJobDataMap();
ApplicationContext applicationContext =ApplicationContextUtil.getInstance();
JobLauncher jobLauncher = (JobLauncher) applicationContext.getBean("jobLauncher");
org.springframework.batch.core.Job job = (org.springframework.batch.core.Job) applicationContext.getBean(data.getString("jobName"));
JobParameters param = new JobParametersBuilder().addString("myparam","myparam").addString(Long.toString(System.currentTimeMillis(),Long.toString(System.currentTimeMillis())).toJobParameters();
JobExecution execution = jobLauncher.run(job, param);
}
}
Singletonclass -
public class ApplicationContextUtil
{
private static ApplicationContext applicationContext;
public static synchronized ApplicationContext getInstance()
{
if(applicationContext == null)
{
applicationContext = new ClassPathXmlApplicationContext("myjob.xml");
}
return applicationContext;
}
}
what are the parameters that you are passing to the Spring Batch job from Quartz? Could you post the exception stack trace?
If your trying to execute the second instance of the batch with the same parameters - it won't work. Spring Batch identifies a unique instance of the job based on parameters passed - so every new instance of the job requires different parameters to be passed.

Spring Jms #JmsListener annotation doesnt work

I have such method
#JmsListener(containerFactory = "jmsListenerContainerFactory", destination = "myQName")
public void rceive(MySerializableObject message) {
log.info("received: {}", message);
}
and such config on xml
<jms:annotation-driven />
<bean id="jmsListenerContainerFactory" class="org.springframework.jms.config.DefaultJmsListenerContainerFactory">
<property name="connectionFactory" ref="pooledConnectionFactory" />
<property name="concurrency" value="3-10" />
</bean>
<bean id="jmsConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
<property name="brokerURL" value="${brokerURL}" />
</bean>
<bean id="pooledConnectionFactory" class="org.apache.activemq.pool.PooledConnectionFactory">
<property name="maxConnections" value="5" />
<property name="maximumActiveSessionPerConnection" value="500" />
<property name="connectionFactory" ref="jmsConnectionFactory" />
</bean>
<bean id="jmsContainerFactory" class="org.springframework.jms.config.DefaultJmsListenerContainerFactory">
<property name="connectionFactory" ref="pooledConnectionFactory" />
<property name="concurrency" value="3-10" />
</bean>
<bean id="jmsTemplate" class="org.springframework.jms.core.JmsTemplate" p:connectionFactory-ref="pooledConnectionFactory" />
Seems consumer was not created. I can send messages but cannot receive them.
Any idea whats wrong here?
Just tested your config and it works well. Only difference that I make a class with #JmsListener as a <bean> in that context:
<bean class="org.springframework.integration.jms.JmsListenerAnnotationTests$TestService"/>
#ContextConfiguration
#RunWith(SpringJUnit4ClassRunner.class)
#DirtiesContext
public class JmsListenerAnnotationTests {
#Autowired
private JmsTemplate jmsTemplate;
#Autowired
private TestService testService;
#Test
public void test() throws InterruptedException {
this.jmsTemplate.convertAndSend("myQName", "foo");
assertTrue(this.testService.receiveLatch.await(10, TimeUnit.SECONDS));
assertNotNull(this.testService.received);
assertEquals("foo", this.testService.received);
}
public static class TestService {
private CountDownLatch receiveLatch = new CountDownLatch(1);
private Object received;
#JmsListener(containerFactory = "jmsListenerContainerFactory", destination = "myQName")
public void receive(String message) {
this.received = message;
this.receiveLatch.countDown();
}
}
}
<jms:annotation-driven /> makes the #JmsListener infrastructure available, but to force Spring to see those methods your classes should be beans anyway.
For example <component-scan> for the package with #Service classes.
Cheers!

How to inject a dependency bean to GridCacheStore implementation?

My config:
<bean parent="cache-template">
<property name="name" value="yagoLabel" />
<property name="cacheMode" value="PARTITIONED" />
<property name="atomicityMode" value="TRANSACTIONAL" />
<property name="distributionMode" value="PARTITIONED_ONLY" />
<property name="backups" value="1" />
<property name="store">
<bean class="id.ac.itb.ee.lskk.lumen.yago.YagoLabelCacheStore" autowire="byType" init-method="init" />
</property>
<property name="writeBehindEnabled" value="true" />
<property name="writeBehindFlushSize" value="102380" />
<property name="writeBehindFlushFrequency" value="30000" />
<property name="writeBehindBatchSize" value="10240" />
<property name="swapEnabled" value="false" />
<property name="evictionPolicy">
<bean class="org.gridgain.grid.cache.eviction.lru.GridCacheLruEvictionPolicy">
<property name="maxSize" value="102400" />
</bean>
</property>
</bean>
And I start GridGain as follows:
My GridCacheStore implementation:
public class YagoLabelCacheStore extends GridCacheStoreAdapter<String, YagoLabel> {
private static final Logger log = LoggerFactory
.getLogger(YagoLabelCacheStore.class);
private DBCollection labelColl;
#GridSpringResource(resourceName="mongoDb")
private DB db;
#Inject
private GridGainSpring grid;
#PostConstruct
public void init() {
log.info("Grid is {}", grid);
labelColl = db.getCollection("label");
}
I start GridGain as follows:
String entityId = "Muhammad";
try (AnnotationConfigApplicationContext appCtx
= new AnnotationConfigApplicationContext(LumenConfig.class)) {
Grid grid = appCtx.getBean(Grid.class);
GridCache<String, YagoLabel> labelCache = YagoLabel.cache(grid);
log.info("Label for {}: {}", entityId, labelCache.get(entityId));
}
LumenConfig Spring configuration contains a DB bean named mongoDb.
However this throws NullPointerException because db is not injected properly. I tried #Inject GridGainSpring just for testing, and even GridGainSpring itself is not injected.
I also tried setting <property name="db" ref="mongoDb"/> in the GridGain Config XML but Spring complains cannot find the bean.
My workaround is to put it inside a public static field but that's soo hacky: https://github.com/ceefour/lumen-kb/blob/b8445fbebd227fb7ac337c758a60badb7ecd3095/cli/src/main/java/id/ac/itb/ee/lskk/lumen/yago/YagoLabelCacheStore.java
The way is to load the GridConfiguration using Spring, then pass it to GridGainSpring.start() :
// "classpath:" is required, otherwise it won't be found in a WAR
#ImportResource("classpath:id/ac/itb/ee/lskk/lumen/core/lumen.gridgain.xml")
#Configuration
public static class GridGainConfig {
#Inject
private ApplicationContext appCtx;
#Inject
private GridConfiguration gridCfg;
#Bean(destroyMethod="close")
public Grid grid() throws GridException {
return GridGainSpring.start(gridCfg, appCtx);
}
}
:-)

I want the current resource processed by MultiResourceItemReader available in beforeStep method

Is there any way to make the currentResource processed by MultiResourceItemReader to make it available in the beforeStep method.Kindly provide me a working code sample. I tried injected multiresourcereader reference in to stepexecutionlistener , but the spring cglib only accepts an interface type to be injected ,i dont know whether to use ItemReader or ItemStream interface.
The MultiResourceItemReader now has a method getCurrentResource() that returns the current Resource.
Retrieving of currentResource from MultiResourceItemReader is not possible at the moment. If you need this API enhancement, create one in Spring Batch JIRA.
Even if there is a getter for currentResource, it's value is not valid in beforeStep(). It is valid between open() and close().
it is possible, if you use a Partition Step and the Binding Input Data to Steps concept
simple code example, with concurrency limit 1 to imitate "serial" processing:
<bean name="businessStep:master" class="org.springframework.batch.core.partition.support.PartitionStep">
<property name="jobRepository" ref="jobRepository"/>
<property name="stepExecutionSplitter">
<bean class="org.springframework.batch.core.partition.support.SimpleStepExecutionSplitter">
<constructor-arg ref="jobRepository"/>
<constructor-arg ref="concreteBusinessStep"/>
<constructor-arg>
<bean class="org.spring...MultiResourcePartitioner" scope="step">
<property name="resources" value="#{jobParameters['input.file.pattern']}"/>
</bean>
</constructor-arg>
</bean>
</property>
<property name="partitionHandler">
<bean class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
<property name="taskExecutor">
<bean class="org.springframework.core.task.SimpleAsyncTaskExecutor">
<property name="concurrencyLimit" value="1" />
</bean>
</property>
<property name="step" ref="concreteBusinessStep"/>
</bean>
</property>
</bean>
<bean id="whateverClass" class="..." scope="step">
<property name="resource" value="#{stepExecutionContext['fileName']}" />
</bean>
example step configuration:
<job id="renameFilesPartitionJob">
<step id="businessStep"
parent="businessStep:master" />
</job>
<step id="concreteBusinessStep">
<tasklet>
<chunk reader="itemReader"
writer="itemWriter"
commit-interval="5" />
</tasklet>
</step>
potential drawbacks:
distinct steps for each file instead of one step
more complicated configuration
Using the getCurrentResource() method from MultiResourceItemReader, update the stepExecution. Example code as below
private StepExecution stepExecution;
#Override
public Resource getCurrentResource() {
this.stepExecution.getJobExecution().getExecutionContext().put("FILE_NAME", super.getCurrentResource().getFilename());
return super.getCurrentResource();
}
#BeforeStep
public void beforeStep(StepExecution stepExecution) {
this.stepExecution = stepExecution;
}
If you make the item you are reading implement ResourceAware, the current resource is set as it is read
public class MyItem implements ResourceAware {
private Resource resource;
//others
public void setResource(Resource resource) {
this.resource = resource;
}
public Resource getResource() {
return resource;
}
}
and in your reader, processor or writer
myItem.getResource()
will return the resource it was loaded from

Resources