Spring Batch: Reading from a JMS queue, step does not end

I have a simple batch job which reads from a JMS queue (ActiveMQ) and writes to a file. The batch job runs as expected and writes to the file, honoring the commit interval, which is set to 10,000.
There are two observations in this regard:
The batch job reading the queue does not end.
I see that all the messages from the queue have been consumed, but the last chunk gets written to the file only when new messages are pushed to the JMS queue and the commit interval is met.
Is this the expected behavior? I would like to schedule the batch job and consume and write all the messages present in the queue at that point in time. Any advice?
@Autowired
private JobBuilderFactory jobBuilderFactory;

@Bean
public TransactionAwareConnectionFactoryProxy activeMQConnectionFactory() {
    ActiveMQConnectionFactory amqConnectionFactory = new ActiveMQConnectionFactory(ActiveMQConnection.DEFAULT_BROKER_URL);
    TransactionAwareConnectionFactoryProxy activeMQConnectionFactory = new TransactionAwareConnectionFactoryProxy(amqConnectionFactory);
    return activeMQConnectionFactory;
}

@Bean
public ActiveMQQueue defaultQueue() {
    return new ActiveMQQueue("firstQueue");
}

@Bean
public PlatformTransactionManager transactionManager() {
    return new ResourcelessTransactionManager();
}

@Bean
public JobRepository jobRepository(PlatformTransactionManager transactionManager) throws Exception {
    return new MapJobRepositoryFactoryBean(transactionManager).getObject();
}

@Bean
@DependsOn("jobRepository")
public SimpleJobLauncher simpleJobLauncher(JobRepository jobRepository) {
    SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
    simpleJobLauncher.setJobRepository(jobRepository);
    return simpleJobLauncher;
}
If I set the receiveTimeout to a smaller value, not all messages are consumed, so I set it to the upper limit.
@Bean
@DependsOn(value = { "activeMQConnectionFactory", "defaultQueue" })
public JmsTemplate firstQueueTemplate(ActiveMQQueue defaultQueue, TransactionAwareConnectionFactoryProxy activeMQConnectionFactory) {
    JmsTemplate firstQueueTemplate = new JmsTemplate(activeMQConnectionFactory);
    firstQueueTemplate.setDefaultDestination(defaultQueue);
    firstQueueTemplate.setSessionTransacted(true);
    firstQueueTemplate.setReceiveTimeout(Long.MAX_VALUE);
    return firstQueueTemplate;
}
Config for the batch job.
@Bean
public JmsItemReader<String> jmsItemReader(JmsTemplate firstQueueTemplate) {
    JmsItemReader<String> jmsItemReader = new JmsItemReader<>();
    jmsItemReader.setJmsTemplate(firstQueueTemplate);
    jmsItemReader.setItemType(String.class);
    return jmsItemReader;
}

@Bean
public ItemWriter<String> flatFileItemWriter() {
    FlatFileItemWriter<String> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource("/mypath/output.csv"));
    writer.setLineAggregator(new PassThroughLineAggregator<String>());
    return writer;
}

@Bean
@DependsOn(value = { "jmsItemReader", "jmsItemWriter", "jobRepository", "transactionManager" })
public Step queueReaderStep(JmsItemReader<String> jmsItemReader, ItemWriter<String> flatFileItemWriter, JobRepository jobRepository,
        PlatformTransactionManager transactionManager) throws Exception {
    StepBuilderFactory stepBuilderFactory = new StepBuilderFactory(jobRepository, transactionManager);
    AbstractTaskletStepBuilder<SimpleStepBuilder<String, String>> step = stepBuilderFactory.get("queueReaderStep").<String, String> chunk(10000)
            .reader(jmsItemReader).writer(flatFileItemWriter);
    return step.build();
}

@Bean
@DependsOn(value = { "jobRepository", "queueReaderStep" })
public Job jsmReaderJob(JobRepository jobRepository, Step queueReaderStep) {
    return this.jobBuilderFactory.get("jsmReaderJob").repository(jobRepository).incrementer(new RunIdIncrementer())
            .flow(queueReaderStep).end().build();
}

The JmsItemReader provided by Spring Batch is really meant as more of a template or example since, as you note, it never returns null, so the step never ends. You'd need to write something that treats a given message as the signal that the step is complete.
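A minimal sketch of that idea (the class name SentinelAwareJmsItemReader and the END_OF_DATA marker are hypothetical): a delegating reader that returns null once it sees a sentinel message, which is what tells Spring Batch that the input is exhausted and the step can end.

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.jms.JmsItemReader;

public class SentinelAwareJmsItemReader implements ItemReader<String> {

    private static final String END_OF_DATA = "END_OF_DATA"; // hypothetical end-of-input marker message

    private final JmsItemReader<String> delegate;

    public SentinelAwareJmsItemReader(JmsItemReader<String> delegate) {
        this.delegate = delegate;
    }

    @Override
    public String read() throws Exception {
        String item = delegate.read();
        // Returning null signals end of input to Spring Batch, so the step (and job) can finish.
        if (item == null || END_OF_DATA.equals(item)) {
            return null;
        }
        return item;
    }
}

With a finite receiveTimeout on the JmsTemplate, the same wrapper could also treat the timeout's null as end of input, which fits the "drain whatever is in the queue at this point in time" scheduling scenario.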

Related

How to use spring transaction support with Spring Batch

I am trying to use Spring Batch to read a .dat file and persist the data into a database. My requirement says to either insert all of the data or insert none of it, i.e., atomicity. However, using Spring Batch I'm not able to achieve this: it reads the data in chunks and keeps inserting as long as the records are fine. If at some point a record is invalid and a DB exception is thrown, I want a complete rollback, which is not happening. Let's say we get an error at the 2051st record; my code saves 2050 records, but I want a complete rollback, and if all the data is good then all N records should be persisted. Thanks in advance for any help or relevant approach that may solve my issue.
NOTE: I have already used the Spring @Transactional annotation on the caller method, but it's not working, and I'm reading data with a chunk size of 10 items.
MyConfiguration.java
@Configuration
public class MyConfiguration
{
    @Autowired
    JobBuilderFactory jobBuilderFactory;

    @Autowired
    StepBuilderFactory stepBuilderFactory;

    @Autowired
    @Qualifier("MyCompletionListener")
    JobCompletionNotificationListener jobCompletionNotificationListener;

    @StepScope
    @Bean(name="MyReader")
    public FlatFileItemReader<InputMapperDTO> reader(@Value("#{jobParameters['fileName']}") String fileName) throws IOException
    {
        FlatFileItemReader<InputMapperDTO> newBean = new FlatFileItemReader<>();
        newBean.setName("MyReader");
        newBean.setResource(new InputStreamResource(FileUtils.openInputStream(new File(fileName))));
        newBean.setLineMapper(lineMapper());
        newBean.setLinesToSkip(1);
        return newBean;
    }

    @Bean(name="MyLineMapper")
    public DefaultLineMapper<InputMapperDTO> lineMapper()
    {
        DefaultLineMapper<InputMapperDTO> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(lineTokenizer());
        Reader reader = new Reader();
        lineMapper.setFieldSetMapper(reader);
        return lineMapper;
    }

    @Bean(name="MyTokenizer")
    public DelimitedLineTokenizer lineTokenizer()
    {
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setDelimiter("|");
        tokenizer.setNames("InvestmentAccountUniqueIdentifier", "BaseCurrencyUniqueIdentifier",
                "OperatingCurrencyUniqueIdentifier", "PricingHierarchyUniqueIdentifier", "InvestmentAccountNumber",
                "DummyAccountIndicator", "InvestmentAdvisorCompanyNumberLegacy", "HighNetWorthAccountTypeCode");
        tokenizer.setIncludedFields(0, 5, 7, 13, 29, 40, 49, 75);
        return tokenizer;
    }

    @Bean(name="MyBatchProcessor")
    public ItemProcessor<InputMapperDTO, FinalDTO> processor()
    {
        return new Processor();
    }

    @Bean(name="MyWriter")
    public ItemWriter<FinalDTO> writer()
    {
        return new Writer();
    }

    @Bean(name="MyStep")
    public Step step1() throws IOException
    {
        return stepBuilderFactory.get("MyStep")
                .<InputMapperDTO, FinalDTO>chunk(10)
                .reader(this.reader(null))
                .processor(this.processor())
                .writer(this.writer())
                .build();
    }
    @Bean(name="MyJob")
    public Job importUserJob(@Autowired @Qualifier("MyStep") Step step1)
    {
        return jobBuilderFactory
                .get("MyJob"+new Date())
                .incrementer(new RunIdIncrementer())
                .listener(jobCompletionNotificationListener)
                .flow(step1)
                .end()
                .build();
    }
}
Writer.java
public class Writer implements ItemWriter<FinalDTO>
{
    @Autowired
    SomeRepository someRepository;

    @Override
    public void write(List<? extends FinalDTO> listOfObjects) throws Exception
    {
        someRepository.saveAll(listOfObjects);
    }
}
JobCompletionNotificationListener.java
public class JobCompletionNotificationListener extends JobExecutionListenerSupport
{
    @Override
    public void afterJob(JobExecution jobExecution)
    {
        if(jobExecution.getStatus() == BatchStatus.COMPLETED)
        {
            System.err.println("****************************************");
            System.err.println("***** Batch Job Completed ******");
            System.err.println("****************************************");
        }
        else
        {
            System.err.println("****************************************");
            System.err.println("***** Batch Job Failed ******");
            System.err.println("****************************************");
        }
    }
}
MyCallerMethod
@Transactional
public String processFile(String datFile) throws JobExecutionAlreadyRunningException, JobRestartException,
        JobInstanceAlreadyCompleteException, JobParametersInvalidException
{
    long st = System.currentTimeMillis();
    JobParametersBuilder builder = new JobParametersBuilder();
    builder.addString("fileName", datFile);
    builder.addDate("date", new Date());
    jobLauncher.run(job, builder.toJobParameters());
    System.err.println("****************************************");
    System.err.println("***** Total time consumed = "+(System.currentTimeMillis()-st)+" ******");
    System.err.println("****************************************");
    return response;
}
The operation I was trying to achieve is not provided by Spring Batch out of the box. For my requirement, I implemented a custom delete that cleans up the database upon failure in any step.
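A minimal sketch of that cleanup approach (an assumption of what it could look like; CleanupOnFailureListener and the unscoped deleteAll() call are placeholders): a job listener that wipes what the run wrote whenever the job does not complete successfully, giving all-or-nothing behavior at the job level while keeping the chunk-oriented step as it is.

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.listener.JobExecutionListenerSupport;
import org.springframework.beans.factory.annotation.Autowired;

public class CleanupOnFailureListener extends JobExecutionListenerSupport
{
    @Autowired
    SomeRepository someRepository;

    @Override
    public void afterJob(JobExecution jobExecution)
    {
        if (jobExecution.getStatus() != BatchStatus.COMPLETED)
        {
            // Remove the rows written by this run so the table ends up either fully
            // loaded or empty. In real code the delete should be scoped to this job
            // execution (for example via a run-id column), not the whole table.
            someRepository.deleteAll();
        }
    }
}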

How to reset MultiResourceItemReader for each job run. Step scope not working

How can I initialize the MultiResourceItemReader for each job run? Currently, with this setup, it is still using the same instance for each job run.
I put @StepScope on it, but it is still using the same old list of files which it has already processed. I am not sure what else I have to add to this code.
I also tried @JobScope and it did not work out either; there is something fundamental I am missing.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Value("file:ftp-inbound/*.csv")
    @Autowired
    private Resource[] inputResources;

    @Autowired
    private StepBuilderFactory steps;

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private ResourceLoader resourceLoader;

    @Bean
    public FlatFileItemReader<AccommodationRoomAvailability> itemReader() throws UnexpectedInputException, ParseException, IOException {
        FlatFileItemReader<AccommodationRoomAvailability> reader = new FlatFileItemReader<AccommodationRoomAvailability>();
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        String[] tokens = {"Product ID", "Allotment", "Kamertype", "Zoeknaam", "Hotel", "Datum", "Beschikbaar", "Nachten"};
        tokenizer.setNames(tokens);
        tokenizer.setDelimiter(";");
        tokenizer.setStrict(true);
        reader.setLinesToSkip(1);
        DefaultLineMapper<AccommodationRoomAvailability> lineMapper = new DefaultLineMapper<AccommodationRoomAvailability>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(new RecordFieldSetMapper());
        reader.setLineMapper(lineMapper);
        return reader;
    }

    @Bean
    @Qualifier("multiResourceReader")
    @StepScope
    public MultiResourceItemReader<AccommodationRoomAvailability> multiResourceItemReader() throws Exception {
        MultiResourceItemReader<AccommodationRoomAvailability> resourceItemReader = new MultiResourceItemReader<AccommodationRoomAvailability>();
        resourceItemReader.setResources(inputResources);
        resourceItemReader.setDelegate(itemReader());
        resourceItemReader.setStrict(false);
        resourceItemReader.setSaveState(false);
        // resourceItemReader.read();
        return resourceItemReader;
    }

    @Bean
    public ItemProcessor<AccommodationRoomAvailability, String> itemProcessor() {
        return new AvailabilityProcessor();
    }

    @Bean
    public ItemWriter itemWriter() {
        return new ItemWriter() {
            @Override
            public void write(List list) throws Exception {
            }
        };
    }

    @Bean
    protected Step step1(@Qualifier("multiResourceReader") MultiResourceItemReader<AccommodationRoomAvailability> reader, ItemProcessor<AccommodationRoomAvailability, String> processor,
            ItemWriter writer) {
        return steps.get("step1")/*.listener(new StepListener())*/.<AccommodationRoomAvailability, String>chunk(30000).reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

    @Bean
    public Step step2() throws IOException {
        FileDeletingTasklet task = new FileDeletingTasklet();
        task.setResources(inputResources);
        return stepBuilderFactory.get("step2")
                .tasklet(task)
                .build();
    }

    @Bean(name = "job")
    public Job job(@Qualifier("step1") Step step1, Step step2) throws IOException {
        return jobs.get("job")
                .start(step1).on("*").to(step2).end()
                // .flow(step1).on("").to(step2()).end()
                .build();
    }
}
Once your application context is created, the injected resources @Value("file:ftp-inbound/*.csv") will be the same during the whole lifetime of your app. That's why the reader will always read the same values.
You need to pass these resources as a parameter to your job and late-bind them in your reader with Step scope. In your example it would be something like:
@Bean
@Qualifier("multiResourceReader")
@StepScope
public MultiResourceItemReader<AccommodationRoomAvailability> multiResourceItemReader(@Value("#{jobParameters['inputResources']}") Resource[] inputResources) throws Exception {
    MultiResourceItemReader<AccommodationRoomAvailability> resourceItemReader = new MultiResourceItemReader<AccommodationRoomAvailability>();
    resourceItemReader.setResources(inputResources);
    resourceItemReader.setDelegate(itemReader());
    resourceItemReader.setStrict(false);
    resourceItemReader.setSaveState(false);
    return resourceItemReader;
}
Then pass input resources as a parameter to your job:
JobParameters jobParameters = new JobParametersBuilder()
.addString("inputResources", "file:ftp-inbound/*.csv")
.toJobParameters();
currently with this setup its still using the same instance for each job run
That's because your resources are always the same when they are injected into a field of your configuration class. If you use the job parameters approach I mentioned in the previous example, you will get a different instance whenever you run the job with a different set of files.
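A small usage sketch of launching the job that way (the runDate parameter is a hypothetical addition to make each scheduled run a new job instance):

JobParameters jobParameters = new JobParametersBuilder()
        .addString("inputResources", "file:ftp-inbound/*.csv")
        .addDate("runDate", new Date()) // hypothetical: varies per run so a new JobInstance is created
        .toJobParameters();
jobLauncher.run(job, jobParameters);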

Spring batch JdbcPagingItemReader not able to read all events

I have a Spring Batch application like the one below (the table name and query have been edited to generic names).
When I execute this program, it is only able to read 7,500 events, i.e., 3 times the chunk size, and is not able to read the remaining records from the Oracle database. The table contains 50 million records which I need to copy to another NoSQL database.
@EnableBatchProcessing
@SpringBootApplication
@EnableAutoConfiguration
public class MultiThreadPagingApp extends DefaultBatchConfigurer {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;

    @Bean
    public DataSource dataSource() {
        final DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("oracle.jdbc.OracleDriver");
        dataSource.setUrl("jdbc:oracle:thin:@***********");
        dataSource.setUsername("user");
        dataSource.setPassword("password");
        return dataSource;
    }

    @Override
    public void setDataSource(DataSource dataSource) {}

    @Bean
    @StepScope
    ItemReader<UserModel> dbReader() throws Exception {
        JdbcPagingItemReader<UserModel> reader = new JdbcPagingItemReader<UserModel>();
        final SqlPagingQueryProviderFactoryBean sqlPagingQueryProviderFactoryBean = new SqlPagingQueryProviderFactoryBean();
        sqlPagingQueryProviderFactoryBean.setDataSource(dataSource);
        sqlPagingQueryProviderFactoryBean.setSelectClause("select * ");
        sqlPagingQueryProviderFactoryBean.setFromClause("from user");
        sqlPagingQueryProviderFactoryBean.setWhereClause("where id>0");
        sqlPagingQueryProviderFactoryBean.setSortKey("name");
        reader.setQueryProvider(sqlPagingQueryProviderFactoryBean.getObject());
        reader.setDataSource(dataSource);
        reader.setPageSize(2500);
        reader.setRowMapper(new BeanPropertyRowMapper<>(UserModel.class));
        reader.afterPropertiesSet();
        reader.setSaveState(true);
        System.out.println("Reading users anonymized in chunks of {}" + 2500);
        return reader;
    }

    @Bean
    public Dbwriter writer() {
        return new Dbwriter(); // I had another class for this
    }

    @Bean
    public Step step1() throws Exception {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.afterPropertiesSet();
        return this.stepBuilderFactory.get("step1")
                .<UserModel, UserModel>chunk(2500)
                .reader(dbReader())
                .writer(writer())
                .taskExecutor(taskExecutor)
                .build();
    }

    @Bean
    public Job multithreadedJob() throws Exception {
        return this.jobBuilderFactory.get("multithreadedJob")
                .start(step1())
                .build();
    }

    @Bean
    public PlatformTransactionManager getTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public JobRepository getJobRepo() throws Exception {
        return new MapJobRepositoryFactoryBean(getTransactionManager()).getObject();
    }

    public static void main(String[] args) {
        SpringApplication.run(MultiThreadPagingApp.class, args);
    }
}
Can you help me efficiently read all the records using Spring Batch, or suggest any other approach to handle this? I have tried the approach mentioned here: http://techdive.in/java/jdbc-handling-huge-resultset
It took 120 minutes to read and save all records with a single-threaded application. Since Spring Batch is a good fit for this, I assume we can handle this scenario in far less time.
You are setting the saveState flag to true (BTW, it should be set before calling afterPropertiesSet) on a JdbcPagingItemReader and using this reader in a multithreaded step. However, it is documented to set this flag to false in a multi-threaded context.
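To make that concrete, a small sketch of the corrected ordering, assuming the reader configuration from the question:

    reader.setSaveState(false);   // a paging reader used by a multi-threaded step should not save state
    reader.afterPropertiesSet();  // validate only after all properties have been set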
Multi-threading with database readers is usually not the best option; I would recommend using partitioning in your case.
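A minimal sketch of the partitioning idea (assumed code, not from the answer; the class name ColumnRangePartitioner and the minId/maxId keys are hypothetical, while the table and id column come from the question's query): split the id range into slices so each worker step reads a disjoint portion of the table with its own reader.

import java.util.HashMap;
import java.util.Map;
import javax.sql.DataSource;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.jdbc.core.JdbcTemplate;

public class ColumnRangePartitioner implements Partitioner {

    private final JdbcTemplate jdbcTemplate;

    public ColumnRangePartitioner(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        long min = jdbcTemplate.queryForObject("select min(id) from user", Long.class);
        long max = jdbcTemplate.queryForObject("select max(id) from user", Long.class);
        long targetSize = (max - min) / gridSize + 1;

        Map<String, ExecutionContext> partitions = new HashMap<>();
        long start = min;
        for (int i = 0; start <= max; i++) {
            ExecutionContext context = new ExecutionContext();
            // each worker step later binds these bounds into its WHERE clause
            context.putLong("minId", start);
            context.putLong("maxId", Math.min(start + targetSize - 1, max));
            partitions.put("partition" + i, context);
            start += targetSize;
        }
        return partitions;
    }
}

Each step-scoped worker reader would then pick up its bounds from the step execution context (for example via @Value("#{stepExecutionContext['minId']}")) and add them to its WHERE clause.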
I had the same problem and fixed it by changing my sortKey. I realized that the previous one was not unique for every record, so I replaced it with the ID, which is different for every record in the database.
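In the question's configuration, that fix amounts to something like this (a sketch, assuming the id column is unique):

    sqlPagingQueryProviderFactoryBean.setSortKey("id"); // unique per row, unlike "name", so pages neither skip nor repeat records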

How can I restrict duplicate job creation using Spring Boot and Spring Batch?

I created a Spring Boot application with Spring Batch and scheduling. When I create only one job, things work fine. But when I try to create another job using the modular approach, the jobs and their steps run many times and get duplicated.
I am getting the below error:
2017-08-24 16:05:00.581 INFO 16172 --- [cTaskExecutor-2] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=importDeptJob]] completed with the following parameters: [{JobID1=1503579900035}] and the following status: [FAILED]
2017-08-24 16:05:00.581 ERROR 16172 --- [cTaskExecutor-2] o.s.batch.core.step.tasklet.TaskletStep : JobRepository failure forcing rollback
org.springframework.dao.OptimisticLockingFailureException: Attempt to update step execution id=1 with wrong version (3), where current version is 1
Can anyone please guide me on how to resolve this and run the jobs in parallel, independently of each other?
Below are the configuration classes: ModularJobConfiguration.java, DeptBatchConfiguration.java, CityBatchConfiguration.java and BatchScheduler.java.
@Configuration
@EnableBatchProcessing(modular=true)
public class ModularJobConfiguration {

    @Bean
    public ApplicationContextFactory firstJob() {
        return new GenericApplicationContextFactory(DeptBatchConfiguration.class);
    }

    @Bean
    public ApplicationContextFactory secondJob() {
        return new GenericApplicationContextFactory(CityBatchConfiguration.class);
    }
}
@Configuration
@EnableBatchProcessing
@Import({BatchScheduler.class})
public class DeptBatchConfiguration {

    private static final Logger LOGGER = LoggerFactory.getLogger(DeptBatchConfiguration.class);

    @Autowired
    private SimpleJobLauncher jobLauncher;

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    public JobExecutionListener listener;

    public ItemReader<DepartmentModelReader> deptReaderSO;

    @Autowired
    @Qualifier("dataSourceReader")
    private DataSource dataSourceReader;

    @Autowired
    @Qualifier("dataSourceWriter")
    private DataSource dataSourceWriter;

    @Scheduled(cron = "0 0/1 * * * ?")
    public void performFirstJob() throws Exception {
        long startTime = System.currentTimeMillis();
        LOGGER.info("Job1 Started at :" + new Date());
        JobParameters param = new JobParametersBuilder().addString("JobID1", String.valueOf(System.currentTimeMillis())).toJobParameters();
        JobExecution execution = (JobExecution) jobLauncher.run(importDeptJob(jobBuilderFactory, stepdept(deptReaderSO, customWriter()), listener), param);
        long endTime = System.currentTimeMillis();
        LOGGER.info("Job1 finished at " + (endTime - startTime) / 1000 + " seconds with status :" + execution.getExitStatus());
    }

    @Bean
    public ItemReader<DepartmentModelReader> deptReaderSO() {
        //LOGGER.info("Inside deptReaderSO Method");
        JdbcCursorItemReader<DepartmentModelReader> deptReaderSO = new JdbcCursorItemReader<>();
        //deptReaderSO.setSql("select id, firstName, lastname, random_num from reader");
        deptReaderSO.setSql("SELECT DEPT_CODE,DEPT_NAME,FULL_DEPT_NAME,CITY_CODE,CITY_NAME,CITY_TYPE_NAME,CREATED_USER_ID,CREATED_G_DATE,MODIFIED_USER_ID,MODIFIED_G_DATE,RECORD_ACTIVITY,DEPT_CLASS,DEPT_PARENT,DEPT_PARENT_NAME FROM TBL_SAMPLE_SAFTY_DEPTS");
        deptReaderSO.setDataSource(dataSourceReader);
        deptReaderSO.setRowMapper(
            (ResultSet resultSet, int rowNum) -> {
                if (!(resultSet.isAfterLast()) && !(resultSet.isBeforeFirst())) {
                    DepartmentModelReader recordSO = new DepartmentModelReader();
                    recordSO.setDeptCode(resultSet.getString("DEPT_CODE"));
                    recordSO.setDeptName(resultSet.getString("DEPT_NAME"));
                    recordSO.setFullDeptName(resultSet.getString("FULL_DEPT_NAME"));
                    recordSO.setCityCode(resultSet.getInt("CITY_CODE"));
                    recordSO.setCityName(resultSet.getString("CITY_NAME"));
                    recordSO.setCityTypeName(resultSet.getString("CITY_TYPE_NAME"));
                    recordSO.setCreatedUserId(resultSet.getInt("CREATED_USER_ID"));
                    recordSO.setCreatedGDate(resultSet.getDate("CREATED_G_DATE"));
                    recordSO.setModifiedUserId(resultSet.getString("MODIFIED_USER_ID"));
                    recordSO.setModifiedGDate(resultSet.getDate("MODIFIED_G_DATE"));
                    recordSO.setRecordActivity(resultSet.getInt("RECORD_ACTIVITY"));
                    recordSO.setDeptClass(resultSet.getInt("DEPT_CLASS"));
                    recordSO.setDeptParent(resultSet.getString("DEPT_PARENT"));
                    recordSO.setDeptParentName(resultSet.getString("DEPT_PARENT_NAME"));
                    // LOGGER.info("RowMapper record : {}", recordSO.getDeptCode() +" | "+recordSO.getDeptName());
                    return recordSO;
                } else {
                    LOGGER.info("Returning null from rowMapper");
                    return null;
                }
            });
        return deptReaderSO;
    }

    @Bean
    public ItemProcessor<DepartmentModelReader, DepartmentModelWriter> processor() {
        //LOGGER.info("Inside Processor Method");
        return new RecordProcessor();
    }

    @Bean
    public ItemWriter<DepartmentModelWriter> customWriter(){
        //LOGGER.info("Inside customWriter Method");
        return new CustomItemWriter();
    }

    @Bean
    public Job importDeptJob(JobBuilderFactory jobs, Step stepdept, JobExecutionListener listener){
        return jobs.get("importDeptJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener())
                .flow(stepdept).end().build();
    }

    @Bean
    public Step stepdept(ItemReader<DepartmentModelReader> deptReaderSO,
            ItemWriter<DepartmentModelWriter> writerSO) {
        LOGGER.info("Inside stepdept Method");
        return stepBuilderFactory.get("stepdept").<DepartmentModelReader, DepartmentModelWriter>chunk(5)
                .reader(deptReaderSO).processor(processor()).writer(customWriter()).transactionManager(platformTransactionManager(dataSourceWriter)).build();
    }

    @Bean
    public JobExecutionListener listener() {
        return new JobCompletionNotificationListener();
    }

    @Bean
    public JdbcTemplate jdbcTemplate(DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

    @Bean
    public BatchWriteService batchWriteService() {
        return new BatchWriteService();
    }

    @Bean
    public PlatformTransactionManager platformTransactionManager(@Qualifier("dataSourceWriter") DataSource dataSourceWriter) {
        JpaTransactionManager transactionManager = new JpaTransactionManager();
        transactionManager.setDataSource(dataSourceWriter);
        return transactionManager;
    }
}
@Configuration
@EnableBatchProcessing
@Import({BatchScheduler.class})
public class CityBatchConfiguration {

    private static final Logger LOGGER = LoggerFactory.getLogger(CityBatchConfiguration.class);

    @Autowired
    private SimpleJobLauncher jobLauncher;

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    public JobExecutionListener listener;

    public ItemReader<CitiesModelReader> citiesReaderSO;

    @Autowired
    @Qualifier("dataSourceReader")
    private DataSource dataSourceReader;

    @Autowired
    @Qualifier("dataSourceWriter")
    private DataSource dataSourceWriter;

    @Scheduled(cron = "0 0/1 * * * ?")
    public void performSecondJob() throws Exception {
        long startTime = System.currentTimeMillis();
        LOGGER.info("\n Job2 Started at :" + new Date());
        JobParameters param = new JobParametersBuilder().addString("JobID2", String.valueOf(System.currentTimeMillis())).toJobParameters();
        JobExecution execution = (JobExecution) jobLauncher.run(importCitiesJob(jobBuilderFactory, stepcity(citiesReaderSO, customCitiesWriter()), listener), param);
        long endTime = System.currentTimeMillis();
        LOGGER.info("Job2 finished at " + (endTime - startTime) / 1000 + " seconds with status :" + execution.getExitStatus());
    }

    @Bean
    public ItemReader<CitiesModelReader> citiesReaderSO() {
        //LOGGER.info("Inside readerSO Method");
        JdbcCursorItemReader<CitiesModelReader> readerSO = new JdbcCursorItemReader<>();
        readerSO.setSql("SELECT CITY_CODE,CITY_NAME,PARENT_CITY,CITY_TYPE,CITY_TYPE_NAME,CREATED_G_DATE,CREATED_USER_ID,MODIFIED_G_DATE,MODIFIED_USER_ID,RECORD_ACTIVITY FROM TBL_SAMPLE_SAFTY_CITIES");
        readerSO.setDataSource(dataSourceReader);
        readerSO.setRowMapper(
            (ResultSet resultSet, int rowNum) -> {
                if (!(resultSet.isAfterLast()) && !(resultSet.isBeforeFirst())) {
                    CitiesModelReader recordSO = new CitiesModelReader();
                    recordSO.setCityCode(resultSet.getLong("CITY_CODE"));
                    recordSO.setCityName(resultSet.getString("CITY_NAME"));
                    recordSO.setParentCity(resultSet.getInt("PARENT_CITY"));
                    recordSO.setCityType(resultSet.getString("CITY_TYPE"));
                    recordSO.setCityTypeName(resultSet.getString("CITY_TYPE_NAME"));
                    recordSO.setCreatedGDate(resultSet.getDate("CREATED_G_DATE"));
                    recordSO.setCreatedUserId(resultSet.getString("CREATED_USER_ID"));
                    recordSO.setModifiedGDate(resultSet.getDate("MODIFIED_G_DATE"));
                    recordSO.setModifiedUserId(resultSet.getString("MODIFIED_USER_ID"));
                    recordSO.setRecordActivity(resultSet.getInt("RECORD_ACTIVITY"));
                    //LOGGER.info("RowMapper record : {}", recordSO.toString());
                    return recordSO;
                } else {
                    LOGGER.info("Returning null from rowMapper");
                    return null;
                }
            });
        return readerSO;
    }

    @Bean
    public ItemProcessor<CitiesModelReader, CitiesModelWriter> citiesProcessor() {
        //LOGGER.info("Inside Processor Method");
        return new RecordCitiesProcessor();
    }

    @Bean
    public ItemWriter<CitiesModelWriter> customCitiesWriter(){
        LOGGER.info("Inside customCitiesWriter Method");
        return new CustomCitiesWriter();
    }

    @Bean
    public Job importCitiesJob(JobBuilderFactory jobs, Step stepcity, JobExecutionListener listener) {
        LOGGER.info("Inside importCitiesJob Method");
        return jobs.get("importCitiesJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener())
                .flow(stepcity).end().build();
    }

    @Bean
    public Step stepcity(ItemReader<CitiesModelReader> readerSO,
            ItemWriter<CitiesModelWriter> writerSO) {
        LOGGER.info("Inside stepCity Method");
        return stepBuilderFactory.get("stepcity").<CitiesModelReader, CitiesModelWriter>chunk(5)
                .reader(readerSO).processor(citiesProcessor()).writer(customCitiesWriter()).transactionManager(platformTransactionManager(dataSourceWriter)).build();
    }

    @Bean
    public JobExecutionListener listener() {
        return new JobCompletionNotificationListener();
    }

    @Bean
    public JdbcTemplate jdbcTemplate(DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

    @Bean
    public BatchWriteService batchWriteService() {
        return new BatchWriteService();
    }

    @Bean
    public PlatformTransactionManager platformTransactionManager(@Qualifier("dataSourceWriter") DataSource dataSourceWriter) {
        JpaTransactionManager transactionManager = new JpaTransactionManager();
        transactionManager.setDataSource(dataSourceWriter);
        return transactionManager;
    }
}
@Configuration
@EnableScheduling
public class BatchScheduler {

    private static final Logger LOGGER = LoggerFactory.getLogger(BatchScheduler.class);

    @Bean
    public ResourcelessTransactionManager resourcelessTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public MapJobRepositoryFactoryBean mapJobRepositoryFactory(
            ResourcelessTransactionManager txManager) throws Exception {
        LOGGER.info("Inside mapJobRepositoryFactory method");
        MapJobRepositoryFactoryBean factory = new MapJobRepositoryFactoryBean(txManager);
        factory.afterPropertiesSet();
        return factory;
    }

    @Bean
    public JobRepository jobRepository(
            MapJobRepositoryFactoryBean factory) throws Exception {
        LOGGER.info("Inside jobRepository method");
        return factory.getObject();
    }

    @Bean
    public SimpleJobLauncher jobLauncher(JobRepository jobRepository) {
        LOGGER.info("Inside jobLauncher method");
        SimpleJobLauncher launcher = new SimpleJobLauncher();
        launcher.setJobRepository(jobRepository);
        final SimpleAsyncTaskExecutor simpleAsyncTaskExecutor = new SimpleAsyncTaskExecutor();
        launcher.setTaskExecutor(simpleAsyncTaskExecutor);
        return launcher;
    }
}
The map-based SimpleJobRepository created from MapJobRepositoryFactoryBean is not thread-safe.
From the Javadocs:
A FactoryBean that automates the creation of a SimpleJobRepository using non-persistent in-memory DAO implementations. This repository is only really intended for use in testing and rapid prototyping. In such settings you might find that ResourcelessTransactionManager is useful (as long as your business logic does not use a relational database). Not suited for use in multi-threaded jobs with splits, although it should be safe to use in a multi-threaded step.
You can create a JDBC-based SimpleJobRepository from JobRepositoryFactoryBean, which may utilize an in-memory H2 database if you don't require that Batch metadata be persisted.
Since you are using Spring Boot, to use an H2-backed JobRepository simply remove your JobRepository bean and add the following dependency to your pom.xml file:
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>
Spring Boot will automatically configure a DataSource as though you had configured the following in your application.properties file and automatically use that DataSource in the creation of a JobRepository.
spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=
Alternatively, to use some other JDBC-backed JobRepository add the JDBC dependencies for your RDBMS of choice to your project, and configure a DataSource for it (either in code as a DataSource bean, or in application.properties using the spring.datasource prefix as shown above). Spring Boot will automatically use this DataSource during the creation of the JobRepository bean.
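A minimal sketch of the "DataSource bean in code" alternative (an assumed example; the PostgreSQL driver, URL and credentials are placeholders, swap in your RDBMS of choice):

@Bean
public DataSource batchDataSource() {
    // hypothetical JDBC DataSource backing the JobRepository
    DriverManagerDataSource dataSource = new DriverManagerDataSource();
    dataSource.setDriverClassName("org.postgresql.Driver");
    dataSource.setUrl("jdbc:postgresql://localhost:5432/batchdb");
    dataSource.setUsername("batch");
    dataSource.setPassword("secret");
    return dataSource;
}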

Spring batch with Spring Boot terminates before children process with AsyncItemProcessor

I'm using Spring Batch with an AsyncItemProcessor and things are behaving unexpectedly. Let me show the code first.
I followed a simple example as shown in the Spring Batch project:
@EnableBatchProcessing
@SpringBootApplication
@Import({HttpClientConfigurer.class, BatchJobConfigurer.class})
public class PerfilEletricoApp {

    public static void main(String[] args) throws Exception {// NOSONAR
        System.exit(SpringApplication.exit(SpringApplication.run(PerfilEletricoApp.class, args)));
        //SpringApplication.run(PerfilEletricoApp.class, args);
    }
}
-- EDIT
If I just sleep the main process to give slf4j a few seconds to flush the logs, everything works as expected.
@EnableBatchProcessing
@SpringBootApplication
@Import({HttpClientConfigurer.class, BatchJobConfigurer.class})
public class PerfilEletricoApp {

    public static void main(String[] args) throws Exception {// NOSONAR
        //System.exit(SpringApplication.exit(SpringApplication.run(PerfilEletricoApp.class, args)));
        ConfigurableApplicationContext context = SpringApplication.run(PerfilEletricoApp.class, args);
        Thread.sleep(1000 * 5);
        System.exit(SpringApplication.exit(context));
    }
}
-- END OF EDIT
I'm reading a text file with a single field and then using an AsyncItemProcessor to get multithreaded processing, which consists of an HTTP GET on a URL to fetch some data. I'm also using a NoOpItemWriter to do nothing on the write part. I'm logging the results of the GET in the processor part of the job (using log.trace / log.warn).
@Configuration
public class HttpClientConfigurer {

    // [... property and configs omitted]

    @Bean
    public CloseableHttpClient createHttpClient() {
        // ... creates and returns a poolable http client etc
    }
}
As for the Job:
@Configuration
public class BatchJobConfigurer {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Value("${async.tps:10}")
    private Integer tps;

    @Value("${com.bemobi.perfilelerico.sourcedir:/AppServer/perfil-eletrico/source-dir/}")
    private String sourceDir;

    @Bean
    public ItemReader<String> reader() {
        MultiResourceItemReader<String> reader = new MultiResourceItemReader<>();
        reader.setResources(new Resource[] { new FileSystemResource(sourceDir) });
        reader.setDelegate((ResourceAwareItemReaderItemStream<? extends String>) flatItemReader());
        return reader;
    }

    @Bean
    public ItemReader<String> flatItemReader() {
        FlatFileItemReader<String> itemReader = new FlatFileItemReader<>();
        itemReader.setLineMapper(new DefaultLineMapper<String>() {{
            setLineTokenizer(new DelimitedLineTokenizer() {{
                setNames(new String[] { "sample-field-001" });
            }});
            setFieldSetMapper(new SimpleStringFieldSetMapper<>());
        }});
        return itemReader;
    }

    @Bean
    public ItemProcessor asyncItemProcessor(){
        AsyncItemProcessor<String, OiPaggoResponse> asyncItemProcessor = new AsyncItemProcessor<>();
        asyncItemProcessor.setDelegate(processor());
        asyncItemProcessor.setTaskExecutor(getAsyncExecutor());
        return asyncItemProcessor;
    }

    @Bean
    public ItemProcessor<String, OiPaggoResponse> processor(){
        return new PerfilEletricoItemProcessor();
    }

    /**
     * Using a NoOpItemWriter<T> so we satisfy spring batch flow but don't use writer for anything else.
     * @return a NoOpItemWriter<OiPaggoResponse>
     */
    @Bean
    public ItemWriter<OiPaggoResponse> writer() {
        return new NoOpItemWriter<>();
    }
    @Bean
    protected Step step1() throws Exception {
        /*
         * The problem starts here: if I use processor(), everything ends nicely, but if I
         * insist on asyncItemProcessor(), the job ends and the logs from the processor are
         * not written to disk.
         */
        return this.steps.get("step1").<String, OiPaggoResponse> chunk(10)
                .reader(reader())
                .processor(asyncItemProcessor())
                .build();
    }
    @Bean
    public Job job() throws Exception {
        return this.jobs.get("consulta-perfil-eletrico").start(step1()).build();
    }

    @Bean(name = "asyncExecutor")
    public TaskExecutor getAsyncExecutor()
    {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(tps);
        executor.setMaxPoolSize(tps);
        executor.setQueueCapacity(tps * 1000);
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.setThreadNamePrefix("AsyncExecutor-");
        return executor;
    }
}
-- UPDATED WITH AsyncItemWriter (Working version)
/*Wrapped Writer*/
@Bean
public ItemWriter asyncItemWriter(){
    AsyncItemWriter<OiPaggoResponse> asyncItemWriter = new AsyncItemWriter<>();
    asyncItemWriter.setDelegate(writer());
    return asyncItemWriter;
}

/*AsyncItemWriter defined on the step*/
@Bean
protected Step step1() throws Exception {
    return this.steps.get("step1").<String, OiPaggoResponse> chunk(10)
            .reader(reader())
            .processor(asyncItemProcessor())
            .writer(asyncItemWriter())
            .build();
}
--
Any thoughts on why the AsyncItemProcessor doesn't wait for all the children to complete before sending an OK/Completed signal to the context?
The issue is that the AsyncItemProcessor is creating Futures that no one is waiting for. Wrap your NoOpItemWriter in the AsyncItemWriter so that someone is waiting for the Futures. That will cause the job to complete as expected.
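A minimal sketch of what the wrapped NoOpItemWriter could look like (an assumption, since the class is not shown in the question); the AsyncItemWriter unwraps each Future before delegating, so by the time write() is called the asynchronous processing has finished:

import java.util.List;
import org.springframework.batch.item.ItemWriter;

public class NoOpItemWriter<T> implements ItemWriter<T> {

    @Override
    public void write(List<? extends T> items) {
        // intentionally empty: the processor already logged the HTTP GET results;
        // the writer only exists so the chunk-oriented step has something to wait on
    }
}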
