Can we run Spring Batch multiple times a day?

I have implemented Spring Batch a couple of times before, but each job was designed to run only once a day.
Now I have a new requirement: I need to start the batch whenever a record is inserted into a table. Each new record should launch the job, which generates a PDF, saves it in a repository, and sends a mail to the user.
I am not sure how to design a Spring Batch job that runs multiple times a day, or whether Spring Batch is even the right choice for this scenario. Can someone please shed some light on this? Thanks!

You can implement a listener that fires when data is stored in the database (this is easy with Hibernate, for instance) and then use CommandLineJobRunner to start your job manually.
See spring_source
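You can run it several times a day; just be careful with the identifying parameters you use for each job instance, because Spring Batch treats a launch with the same job name and the same identifying parameters as the same job instance.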

As per your requirement, you can achieve this with the help of @EntityListeners (if you are working with Hibernate).
Let me give you a dummy scenario:
@Entity
@Table(name="Order")
@EntityListeners(OrderListner.class)
public class Order {
    @Id
    public Integer id;
    // other properties
}
The listener:
class OrderListner {
    // Callbacks declared on a listener class receive the affected entity as their argument.
    @PostPersist
    public void doStartSchedulerCode(Order order) {
        // Call the code responsible for generating the PDF and sending the mail from here.
    }
}
Each time you insert a row into the Order table, doStartSchedulerCode() will be called.
Try this.

Related

Spring Boot Caching auto refresh using @PostConstruct

I currently have a Spring Boot based application with no active cache. Our application depends heavily on key-value configurations that we maintain in an Oracle DB. Without a cache, every lookup of a value from that table is a database call. As expected, this causes a lot of overhead because of the high number of transactions against the DB. Hence the need for a cache.
When searching for caching solutions for Spring Boot, I mostly found examples that cache objects while CRUD operations are performed through the application code itself, using annotations like @Cacheable, @CachePut, @CacheEvict, etc., but this does not apply to me. I have master data of key-value pairs in the DB. Any change needs approval, so users are not given direct access; once approved, the change is made directly in the DB.
I want these key-values to be loaded at startup and kept in memory, so I tried to implement this using @PostConstruct and a ConcurrentHashMap, something like this:
public ConcurrentHashMap<String, String> cacheMap = new ConcurrentHashMap<>();
@PostConstruct
public void initialiseCacheMap() {
    List<MyEntity> list = myRepository.findAll();
    for (int i = 0; i < list.size(); i++) {
        cacheMap.put(list.get(i).getKey(), list.get(i).getValue());
    }
}
In my service class, whenever I want to get something, I first check whether the data is available in the map; if not, I check the DB.
This serves my purpose, and it drastically improves the performance of the application. A certain set of transactions that earlier took 6.28 seconds now completes in a mere 562 milliseconds! However, there is one problem I am not able to figure out:
@PostConstruct is called once by Spring, on startup, after dependency injection. This means I have no way to re-trigger the cache build without a restart or application downtime, which is unfortunately not acceptable. Further, as of now, I do not have the liberty to use any existing caching frameworks or libraries like Ehcache or Redis.
How can I achieve periodic refreshing of this cache (let's say every 30 minutes) with only plain old Java/Spring classes and libraries?
Thanks in advance for any ideas!
You can do this in several ways; one option is something along these lines:
private const val everyThirtyMinutes = "0 0/30 * * * ?"
@Component
class TheAmazingPreloader {
    @Scheduled(cron = everyThirtyMinutes)
    @EventListener(ApplicationReadyEvent::class)
    fun refreshCachedEntries() {
        // the preloading happens here
    }
}
This preloads the entries once the application has started and also puts a refresh mechanism in place that triggers every 30 minutes.
You will also need to add the following annotation to a @Configuration class or to the @SpringBootApplication class:
@EnableScheduling
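For readers who want to see the same idea without Spring's scheduling support, here is a minimal plain-Java sketch. The class name RefreshingCache and the loader supplier are hypothetical; the loader stands in for whatever wraps myRepository.findAll() in the question. The sketch replaces the whole map on each refresh instead of updating entries in place, so a reader never observes a half-rebuilt cache and keys deleted in the DB disappear from memory as well.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical holder for the key-value configuration; not part of the original code. */
class RefreshingCache {
    // Readers always see one complete snapshot; refresh swaps the reference in a single write.
    private volatile Map<String, String> snapshot = Map.of();
    private final Supplier<Map<String, String>> loader;

    RefreshingCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    /** Reloads every entry from the loader and publishes the result as the new snapshot. */
    void refresh() {
        snapshot = Map.copyOf(loader.get());
    }

    String get(String key) {
        return snapshot.get(key);
    }

    /** Loads once immediately, then again after every period. The caller shuts the scheduler down. */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        refresh();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-refresh");
            t.setDaemon(true); // do not keep the JVM alive just for the refresh thread
            return t;
        });
        scheduler.scheduleWithFixedDelay(this::refreshQuietly, period, period, unit);
        return scheduler;
    }

    // An exception escaping a scheduled task cancels all later runs, so keep the old snapshot instead.
    private void refreshQuietly() {
        try {
            refresh();
        } catch (RuntimeException e) {
            System.err.println("Cache refresh failed, keeping previous snapshot: " + e);
        }
    }
}
```

Calling new RefreshingCache(loader).start(30, TimeUnit.MINUTES) would give the 30-minute refresh asked for in the question; the service class keeps its existing "check the map first, then the DB" lookup.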

Nested transaction in SpringBatch tasklet not working

I'm using Spring Batch for my app. In one of the batch jobs, I need to process multiple data items. Each item requires several database updates, and I need one transaction per item. That is, if an exception is thrown while processing one item, the database updates for that item are rolled back, and processing continues with the next item.
I've put all the database updates in one method in the service layer. In my Spring Batch tasklet, I call that method for each item, like this:
for (RequestViewForBatch request : requestList) {
orderService.processEachRequest(request);
}
In the service class the method is like this;
@Transactional(propagation = Propagation.NESTED, timeout = 100, rollbackFor = Exception.class)
public void processEachRequest(RequestViewForBatch request) {
//update database
}
When executing the task, it gives me this error message
org.springframework.transaction.NestedTransactionNotSupportedException: Transaction manager does not allow nested transactions by default - specify 'nestedTransactionAllowed' property with value 'true'
but I don't know how to solve this error.
Any suggestion would be appreciated. Thanks in advance.
The tasklet step is executed in a transaction driven by Spring Batch. You need to remove the @Transactional annotation from your processEachRequest method.
Instead, use a fault-tolerant chunk-oriented step configured with a skip policy. In that case, only the faulty items are skipped. Please refer to the Configuring Skip Logic section of the documentation. You can find an example here.

How to lock the job during execution in Laravel?

I see withoutOverlapping() mutex for commands, but I don't see it for jobs. How can I protect jobs of the same type from overlapping each other?
Thanks!
I think it's possible using the following:
https://laravel.com/docs/8.x/queues#unique-jobs
You can specify a key, passed to the job, that marks its uniqueness. In my case, the job makes requests to a third-party API that I need to limit; with more than one worker handling the queue, it is possible to get a 429 from the API. Since I have many API keys (one per user of the app), I can use the key so that jobs of the same type run independently for different users, while a job with a given key is locked out until the current job with that key has completed.
Like this:
// The job class must implement the ShouldBeUnique interface
class UpdateSpreadsheet implements ShouldQueue, ShouldBeUnique
{
    // some other code
    public function __construct($keyValue)
    {
        // some other constructor code if needed
        $this->keyValue = $keyValue;
    }
    // This function sets the unique key
    public function uniqueId()
    {
        return $this->keyValue;
    }
    // If you don't need to wait until the job is processed, you may also specify
    // the time (in seconds) after which the lock is forcibly removed (so you'll
    // be able to queue another job with this key after 10 seconds even if the
    // current job is still in process)
    public $uniqueFor = 10;
}
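The keyed-lock idea behind the answer above can be illustrated independently of Laravel. The following Java sketch is conceptual only and is not how Laravel implements unique jobs; the class KeyedJobGuard and the method runIfIdle are hypothetical names. It shows the behaviour described: work with the same key is rejected while an earlier run with that key is still active, while work with a different key proceeds.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory guard: at most one active job per key. */
class KeyedJobGuard {
    // Keys of the jobs that are currently running.
    private final Set<String> running = ConcurrentHashMap.newKeySet();

    /**
     * Runs the job unless another job with the same key is active.
     * Returns true if the job ran, false if it was rejected.
     */
    boolean runIfIdle(String key, Runnable job) {
        // add() is atomic and returns false when the key is already present.
        if (!running.add(key)) {
            return false;
        }
        try {
            job.run();
            return true;
        } finally {
            // Always release the key, even if the job throws.
            running.remove(key);
        }
    }
}
```

This guard only works inside one process. With several queue workers the set of active keys has to live in a shared store, which is the role the cache lock plays in the Laravel answer.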

Is it possible to lock some entries in MongoDB and run a query that does not take the locked records into account?

I have a MongoDB database that contains a list of "tasks" and two instances of an executor. These two executors have to read a task from the DB, save it in the state "IN_EXECUTION", and execute the task. Of course, I do not want my two executors to execute the same task, and this is my problem.
I use a transactional query. This way, when an executor tries to change the state of a task that the other executor has already taken, it gets a "write exception" and has to start again and read a new task. The problem with this approach is that sometimes an executor gets a lot of errors before it can save the state change correctly and execute a new task. So it is as if I had only one executor.
Note:
- I do not want to block my entire DB on read/write because this would slow down the entire process.
- I think it is necessary to save the state of the task because it could be a long task.
I asked whether it is possible to lock only certain records and run a query on the "not-locked" records, but any advice that solves my problem will be really appreciated.
Thanks in advance.
EDIT1:
Sorry, I simplified the concept in the question above. Actually, I extract n messages that I have to send. I have to send these messages in blocks of 100, so my executors split the extracted messages into blocks of 100 and pass them on to other executors.
Each executor extracts the messages and then updates them with the new state. I hope this is clearer now.
@Transactional(readOnly = false, propagation = Propagation.REQUIRED)
public List<PushMessageDB> assignPendingMessages(int limitQuery, boolean sortByClientPriority,
LocalDateTime now, String senderId) {
final List<PushMessageDB> messages = repositoryMessage.findByNotSendendAndSpecificError(limitQuery, sortByClientPriority, now);
long count = repositoryMessage.updateStateAndSenderId(messages, senderId, MessageState.IN_EXECUTION);
return messages;
}
DB update:
public long updateStateAndSenderId(List<String> ids, String senderId, MessageState messageState) {
Query query = new Query(Criteria.where(INTERNAL_ID).in(ids));
Update update = new Update().set(MESSAGE_STATE, messageState).set(SENDER_ID, senderId);
return mongoTemplate.updateMulti(query, update, PushMessageDB.class).getModifiedCount();
}
You will have to do the locking one-by-one.
Trying to lock 100 records at once and at the same time have a second process also lock 100 records (without any coordination between the two) will almost certainly result in an overlapping set unless you have a huge selection of available records.
Depending on your application, having all work done by one thread (and the other being just a "hot standby") may also be acceptable as long as that single worker does not get overloaded.
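To make "locking one-by-one" concrete, here is a minimal in-memory Java sketch of the pattern. It is an analogue, not MongoDB code: TaskClaimer and its methods are hypothetical, and the map stands in for the collection. Each claim is a single compare-and-set from PENDING to IN_EXECUTION, so exactly one executor wins each task and the loser simply moves on to the next one. Against MongoDB, the corresponding step would be one atomic find-and-modify per document (for example via MongoTemplate.findAndModify) that matches only documents still in the pending state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the task collection. */
class TaskClaimer {
    enum State { PENDING, IN_EXECUTION }

    private final Map<String, State> tasks = new ConcurrentHashMap<>();

    void add(String taskId) {
        tasks.put(taskId, State.PENDING);
    }

    /** Claims at most one pending task; the compare-and-set allows a single winner per task. */
    Optional<String> claimOne() {
        for (String id : tasks.keySet()) {
            // replace(key, expected, updated) succeeds only if the state is still PENDING.
            if (tasks.replace(id, State.PENDING, State.IN_EXECUTION)) {
                return Optional.of(id);
            }
        }
        return Optional.empty();
    }

    /** Builds a block of up to 'size' tasks from repeated single claims. */
    List<String> claimBatch(int size) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < size) {
            Optional<String> id = claimOne();
            if (id.isEmpty()) {
                break; // nothing left to claim
            }
            batch.add(id.get());
        }
        return batch;
    }
}
```

Because every claim is independent, two executors calling claimBatch(100) at the same time end up with disjoint blocks instead of retrying a whole failed transaction.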

Quartz .NET - Prevent parallel Job Execution

I am using Quartz .NET for job scheduling.
So I created one job class (implementing IJob).
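public class TransferData : IJob
{
    public Task Execute(IJobExecutionContext context)
    {
        // GetString returns the value as a string; Get returns a plain object.
        string tableName = context.JobDetail.JobDataMap.GetString("table");
        // Transfer the table here.
        return Task.CompletedTask;
    }
}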
So I want to transfer different and multiple tables. For this purpose I am doing something like this:
foreach (Table table in tables)
{
    IJobDetail job = JobBuilder.Create<TransferData>()
        .WithIdentity(new JobKey(table.Name, "table_transfer"))
        .UsingJobData("table", table.Name)
        .Build();
    // Quartz cron expressions start with a seconds field: fire at second 0 of every 5th minute.
    ITrigger trigger = TriggerBuilder.Create()
        .WithIdentity(new TriggerKey("trigger_" + table.Name, "table_trigger"))
        .WithCronSchedule("0 0/5 * * * ?")
        .ForJob(job)
        .Build();
    await this.scheduler.ScheduleJob(job, trigger);
}
So every table should be transferred every 5 minutes. To achieve this, I create several jobs with different job names.
The question is: how do I prevent parallel job execution for the same job name? (For example, if the previous run takes longer for one table, I do not want to start the next transfer for the same table.)
I know about the [DisallowConcurrentExecution] attribute, but I understood it to prevent parallel execution for the same job class. I do not want to write an extra job class per table, because the "main" transfer code is always the same and the only difference is the table name. So I want to use the same job class for this purpose.
The Quartz .NET documentation is a little bit confusing.
DisallowConcurrentExecution is an attribute that can be added to the
Job class that tells Quartz not to execute multiple instances of a
given job definition (that refers to the given job class)
concurrently. Notice the wording there, as it was chosen very
carefully. In the example from the previous section, if
“SalesReportJob” has this attribute, than only one instance of
“SalesReportForJoe” can execute at a given time, but it can execute
concurrently with an instance of “SalesReportForMike”. The constraint
is based upon an instance definition (JobDetail), not on instances of
the job class. However, it was decided (during the design of Quartz)
to have the attribute carried on the class itself, because it does
often make a difference to how the class is coded.
Source: https://www.quartz-scheduler.net/documentation/quartz-3.x/tutorial/more-about-jobs.html
But if you read the API documentation, it says (the bold text is important!):
An attribute that marks a IJob class as one that must not have
multiple instances executed concurrently (where instance is based-upon
a IJobDetail definition - or in other words based upon a JobKey).
Source: https://quartznet.sourceforge.io/apidoc/3.0/html/
In other words: the DisallowConcurrentExecution attribute works for my purposes, because the constraint applies per JobKey and each table has its own JobKey.
