How to write a tenant-aware RepositoryItemReader in Spring Batch?

I have a job configured to run based on job parameters, integrated with Spring Web and Quartz so it can be invoked on demand or on a cron schedule. I am using RepositoryItemReader to take advantage of Spring Data. This is running as expected.
Now I want to introduce multi-tenancy in the job. I have 3 tenants with different databases, say tenant1, tenant2 and tenant3. Basically, I want the batch job to pick the data from the database selected by the job parameter: if the job parameter is tenant1, I want to pick the data from the tenant1 database.
I found an article on how to introduce multi-tenancy in a Spring Boot application here: https://www.baeldung.com/multitenancy-with-spring-data-jpa
The problem is that I am not able to understand where I could inject the tenant context into the thread, as I am using an AsyncTaskScheduler to launch the job and there are other jobs also registered in the context.
JobParameters jobParameters = new JobParametersBuilder()
        .addString("tenantId", tenantId)
        .addString("jobName", jobName)
        .addLong("time", System.currentTimeMillis())
        .toJobParameters();
Job job = jobRegistry.getJob(jobName);
JobExecution jobExecution = asyncJobLauncher.run(job, jobParameters);
My ItemReader bean is defined as:
@StepScope
@Bean
public ItemReader<Person> itemReader() {
    return new RepositoryItemReaderBuilder<Person>()
            .name("ItemReader")
            .repository(personRepository)
            .arguments("personName")
            .methodName("findByPersonNameEquals")
            .maxItemCount(30)
            .pageSize(5)
            .sorts(Collections.singletonMap("createTs", Sort.Direction.ASC))
            .build();
}
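One way to get the tenant onto the thread that actually reads the data is a StepExecutionListener that reads the tenantId job parameter before the step runs. This is a sketch, assuming a hypothetical ThreadLocal-based TenantContext holder that your routing DataSource consults; neither class is part of Spring Batch.

```java
// Sketch: set the tenant from the tenantId job parameter around step execution.
// TenantContext is a hypothetical ThreadLocal holder, not a framework class.
public class TenantStepListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        String tenantId = stepExecution.getJobParameters().getString("tenantId");
        TenantContext.setTenantId(tenantId);
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        TenantContext.clear(); // avoid leaking the tenant to the next job on this thread
        return stepExecution.getExitStatus();
    }
}
```

Register the listener on the step with `.listener(new TenantStepListener())`. Note this only works for a single-threaded step, since the listener and the reader then run on the same thread.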

I discovered a workaround for the problem:
- extend RepositoryItemReader into something like TenantAwareRepositoryItemReader, which takes the tenant as a constructor argument;
- override the doPageRead() method in TenantAwareRepositoryItemReader to set the tenantId in the thread context, call super.doPageRead(), and then clear the DB thread context;
- use the TenantAwareRepositoryItemReader as the itemReader.
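The thread context this workaround relies on can be as simple as the following sketch; the class name TenantContext is an assumption, any per-thread holder works the same way.

```java
// Minimal ThreadLocal-based tenant holder; the name TenantContext is an
// assumption, not a Spring Batch or Spring Data class.
final class TenantContext {

    private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();

    private TenantContext() {
    }

    // Called before super.doPageRead() in the overridden reader
    static void setTenantId(String tenantId) {
        CURRENT_TENANT.set(tenantId);
    }

    // Consulted by the tenant-routing DataSource to pick the database
    static String getTenantId() {
        return CURRENT_TENANT.get();
    }

    // Called after super.doPageRead() so the thread can be reused safely
    static void clear() {
        CURRENT_TENANT.remove();
    }
}
```

In the TenantAwareRepositoryItemReader, doPageRead() then wraps super.doPageRead() in setTenantId(...) and clear() calls, as described above.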

Related

Nested transaction in SpringBatch tasklet not working

I'm using Spring Batch for my app. In one of the batch jobs, I need to process multiple data items. Each item requires several database updates, and I need one transaction per item: if an exception is thrown while processing one item, the database updates for that item are rolled back, and processing continues with the next item.
I've put all the database updates in one method in the service layer. In my Spring Batch tasklet, I call that method for each item, like this:
for (RequestViewForBatch request : requestList) {
    orderService.processEachRequest(request);
}
In the service class, the method looks like this:
@Transactional(propagation = Propagation.NESTED, timeout = 100, rollbackFor = Exception.class)
public void processEachRequest(RequestViewForBatch request) {
    // update database
}
When executing the task, it gives me this error message
org.springframework.transaction.NestedTransactionNotSupportedException: Transaction manager does not allow nested transactions by default - specify 'nestedTransactionAllowed' property with value 'true'
but I don't know how to solve this error.
Any suggestion would be appreciated. Thanks in advance.
The tasklet step will be executed in a transaction driven by Spring Batch. You need to remove the @Transactional annotation from your processEachRequest method.
You would need a fault-tolerant chunk-oriented step configured with a skip policy. In this case, only faulty items will be skipped. Please refer to the Configuring Skip Logic section of the documentation. You can find an example here.
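The fault-tolerant step suggested above could be configured roughly as follows; this is a sketch, and the names orderStep, reader and writer are placeholders for your own beans.

```java
// Sketch of a fault-tolerant chunk-oriented step with a skip policy:
// items whose processing throws are skipped instead of failing the step.
@Bean
public Step orderStep(StepBuilderFactory stepBuilderFactory,
                      ItemReader<RequestViewForBatch> reader,
                      ItemWriter<RequestViewForBatch> writer) {
    return stepBuilderFactory.get("orderStep")
            .<RequestViewForBatch, RequestViewForBatch>chunk(10)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .skip(Exception.class) // skip only the faulty items
            .skipLimit(100)        // fail the step if more than 100 items are skipped
            .build();
}
```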

How to dynamically schedule multiple tasks in Spring Boot

I want to dynamically schedule a task based on user input in a given popup.
The user should be able to schedule multiple tasks, and each task should be repeatable.
I have tried to follow some of the possibilities offered by Spring Boot using the examples below:
example 1: https://riteshshergill.medium.com/dynamic-task-scheduling-with-spring-boot-6197e66fec42
example 2: https://www.baeldung.com/spring-task-scheduler#threadpooltaskscheduler
The idea of example 1 is to send an HTTP POST request that then invokes a scheduled task, with each HTTP call leading to a console print.
But I am still not able to reach the needed behaviour; what I get as a result is task1 executing when invoked by action1, but as soon as task2 is executed by action2, task1 stops executing.
Any idea how the needed logic could be implemented?
Example 1 demonstrates how to schedule a task from a REST API request and example 2 shows how to create a ThreadPoolTaskScheduler for the TaskScheduler. But you are missing an important point here: even if you create a thread pool, the TaskScheduler is not aware of it, so it needs to be configured to use that pool. For that, implement the SchedulingConfigurer interface. Here is an example:
@Configuration
@EnableScheduling
public class TaskConfigurer implements SchedulingConfigurer {

    @Override
    public void configureTasks(ScheduledTaskRegistrar taskRegistrar) {
        // Create the ThreadPoolTaskScheduler and hand it to the registrar
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setPoolSize(5);
        scheduler.initialize();
        taskRegistrar.setTaskScheduler(scheduler);
    }
}
After creating such a configuration class, everything should work fine.

Execution time of a spring boot job

I am trying to calculate the total time taken by a Spring Batch job. I have used Spring Boot to trigger the job. Before Spring Boot triggers the batch job, the datasource and other beans required by the job are configured, which consumes some time. Should I also consider this time when calculating the total execution time of the job, given that the job uses the datasource and beans configured by Spring Boot?
Should I consider this time also to calculate the total amount of time taken by the Spring Boot job
The simple answer is NO, unless you reconnect the datasource and refresh the configuration on each execution of the job.
When you say your application (Boot or Batch) is up and ready for execution, it means all the components are initialised, dependencies are resolved, connections are made, and it is just waiting for a task/trigger to start execution.
This means the time taken by the datasource config or context setup is not part of your job execution time.
Should I consider this time also to calculate the total amount of time taken by the Spring Boot job for execution
It depends on what you want to measure. If you want to measure the execution time of the whole Spring Boot app (from the OS point of view, the total time of running your JVM process), then yes, you need to include everything.
If you want to measure the execution time of your Spring Batch job and only that, you can use a JobExecutionListener, for example:
class ExecutionTimeJobListener implements JobExecutionListener {

    private Logger logger = LoggerFactory.getLogger(ExecutionTimeJobListener.class);
    private StopWatch stopWatch = new StopWatch();

    @Override
    public void beforeJob(JobExecution jobExecution) {
        stopWatch.start();
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        stopWatch.stop();
        logger.info("Job took " + stopWatch.getTotalTimeSeconds() + "s");
    }
}
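For the listener to take effect, it has to be registered when building the job. A configuration sketch, where "exportJob" and exportStep are placeholder names:

```java
// Sketch: attach the ExecutionTimeJobListener to the job definition.
@Bean
public Job exportJob(JobBuilderFactory jobBuilderFactory, Step exportStep) {
    return jobBuilderFactory.get("exportJob")
            .listener(new ExecutionTimeJobListener()) // beforeJob/afterJob will now fire
            .start(exportStep)
            .build();
}
```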

Allow the end user to schedule task with Spring Boot

I'm using Spring Boot and I want to allow the end user to schedule tasks as he wants.
I have a Spring Boot REST backend with an Angular frontend. The user should be able to schedule a task (i.e. a method on the backend side) with a crontab-style syntax, and choose some arguments.
The user should also be able to view, edit and delete scheduled tasks from the front end.
I know I can use the @Scheduled annotation, but I don't see how the end user could schedule tasks with it. I also took a look at Quartz, but I don't see how to set the user's arguments in my method call.
Should I use Spring Batch?
To schedule jobs programmatically, you have the following options:
For simple cases, have a look at ScheduledExecutorService, which can schedule commands to run after a given delay or to execute periodically. It's in the java.util.concurrent package and easy to use.
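A minimal, self-contained sketch of the ScheduledExecutorService option; the 50 ms period and the class name are arbitrary choices for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: run a task periodically with ScheduledExecutorService.
class FixedRateExample {

    // Schedules a task every 50 ms and waits until it has fired three times.
    static int runThreeTimes() throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        CountDownLatch threeRuns = new CountDownLatch(3);
        scheduler.scheduleAtFixedRate(threeRuns::countDown, 0, 50, TimeUnit.MILLISECONDS);
        boolean completed = threeRuns.await(5, TimeUnit.SECONDS);
        scheduler.shutdownNow(); // stop the periodic task
        return completed ? 3 : -1;
    }
}
```

The downside for this question is that ScheduledExecutorService only understands delays and fixed rates, not cron expressions, which is why Quartz is suggested next.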
To schedule crontab jobs dynamically, you can use Quartz (see the official examples). Basically, what you'll do is:
- create a Scheduler instance, which can be defined as a Java bean and autowired by Spring,
- create a JobDetail,
- create a CronTrigger,
- schedule the job with the cron trigger.
Official example code (scheduled to run every 20 seconds):
// newJob, newTrigger and cronSchedule are static imports from
// org.quartz.JobBuilder, TriggerBuilder and CronScheduleBuilder
SchedulerFactory sf = new StdSchedulerFactory();
Scheduler sched = sf.getScheduler();
JobDetail job = newJob(SimpleJob.class)
        .withIdentity("job1", "group1")
        .build();
CronTrigger trigger = newTrigger()
        .withIdentity("trigger1", "group1")
        .withSchedule(cronSchedule("0/20 * * * * ?"))
        .build();
sched.scheduleJob(job, trigger);
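As for setting the user's arguments in the method call, with Quartz that is typically done through the JobDataMap. A sketch, where "userArg" is a placeholder key chosen for illustration:

```java
// Sketch: pass user-chosen arguments to the Quartz job via the JobDataMap.
JobDetail job = newJob(SimpleJob.class)
        .withIdentity("job1", "group1")
        .usingJobData("userArg", "value-from-frontend") // user's argument
        .build();

// The job class reads the argument back from the execution context:
public class SimpleJob implements org.quartz.Job {
    @Override
    public void execute(JobExecutionContext context) {
        String userArg = context.getJobDetail().getJobDataMap().getString("userArg");
        // use userArg here
    }
}
```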

Spring batch A job instance already exists

OK, I know this has been asked before, but I still can't find a definite answer to my question, which is this: I am using Spring Batch to export data to a SOLR search server. The job needs to run every minute so I can export all the updates. The first execution passes OK, but the second one complains with:
2014-10-02 20:37:00,022 [defaultTaskScheduler-1] ERROR: catching
org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException: A job instance already exists and is complete for parameters={catalogVersionPK=3378876823725152, type=UPDATE}. If you want to run this job again, change the parameters.
    at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:126)
    at
Of course I can add a date-time parameter to the job like this:
.addLong("time", System.currentTimeMillis())
and then the job can be run more than once. However, I also want to query the last execution of the job, so I have code like this:
DateTime endTime = new DateTime(0);
JobExecution je = jobRepository.getLastJobExecution("searchExportJob",
        new JobParametersBuilder()
                .addLong("catalogVersionPK", catalogVersionPK)
                .addString("type", type)
                .toJobParameters());
if (je != null && je.getEndTime() != null) {
    endTime = new DateTime(je.getEndTime());
}
and this returns nothing, because I didn't provide the time parameter. So it seems I can either run the job once and get the last execution time, or run it multiple times and not get the last execution time. I am really stuck :(
Assumption
Spring Batch uses some tables to store each executed job with its parameters.
If you run the job twice with the same parameters, the second run fails, because the job instance is identified by jobName and parameters.
Solution 1
You could use the JobExecution returned when running a new job:
JobExecution execution = jobLauncher.run(job, new JobParameters());
.....
// Use a JobExecutionDao to retrieve the JobExecution by ID
JobExecution ex = jobExecutionDao.getJobExecution(execution.getId());
Solution 2
You could implement a custom JobExecutionDao and perform a custom query to find your JobExecution in the BATCH_JOB_EXECUTION table.
See the Spring reference documentation.
I hope my answer is helpful to you.
Use the JobExplorer as suggested by Luca Basso Ricci.
Because you do not know the job parameters, you need to look it up by instance:
- look for the last instance of the job named searchExportJob,
- look for the last execution of that instance.
This way you use the Spring Batch API only.
// We can pass count 1 because job instances are ordered by instance id descending,
// so we know the first one returned is the last instance
List<JobInstance> instances = jobExplorer.getJobInstances("searchExportJob", 0, 1);
JobInstance lastInstance = instances.get(0);
List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(lastInstance);
// JobExecutions are ordered by execution id descending, so the first
// result is the last execution
JobExecution je = jobExecutions.get(0);
if (je != null && je.getEndTime() != null) {
    endTime = new DateTime(je.getEndTime());
}
Note this code only works for Spring Batch 2.2.x and above; in 2.1.x the API was somewhat different.
There is another interface you can use: JobExplorer. From its javadoc:
Entry point for browsing executions of running or historical jobs and steps. Since the data may be re-hydrated from persistent storage, it may not contain volatile fields that would have been present when the execution was active.
If you are debugging your batch job and terminate it before it completes, you will get this error when you try to start it again.
To start it again, either update the name of your job so that it creates another execution id, or update the tables below:
BATCH_JOB_EXECUTION
BATCH_STEP_EXECUTION
You need to update the STATUS and END_TIME columns with non-null values.
Create a new job RunId every time.
If your code creates the Job object using a JobBuilderFactory, then the snippet below would be useful for the problem:
return jobBuilderFactory
        .get("someJobName")
        .incrementer(new RunIdIncrementer()) // solution lies here: creating a new job id every time
        .flow( // and here
                stepBuilderFactory
                        .get("someTaskletStepName")
                        .tasklet(tasklet) // you can replace it with a step
                        .allowStartIfComplete(true) // this makes the job run even if it completed in the last run
                        .build())
        .end()
        .build();
