scheduling jobs using spring batch or just Quartz scheduler - spring

I am looking for the best way to build a Java web application that generates reports in Excel/PDF format, something similar to Google AdWords, where a user can schedule reports and download them later once they have been generated.
I am thinking of developing a Java application where the user logs in, selects a predefined report, and provides the input parameters (report date, etc.). The request is then queued or saved as a Quartz job (I would prefer a persistent queue). A job monitors the queue, executes each request, generates the report (Excel/PDF output), and stores it on disk.
When the user refreshes the screen or logs back in later, the report should be available for download.
Can I do this using Spring Batch and the Quartz scheduler? I would also like something like Spring admin, where I can see the number of requests in the queue (jobs queued up), stop queue processing, etc.

You would use spring-batch if you wanted to process all report requests at the same time, perhaps at night when your servers are not otherwise occupied processing real-time user requests (or even during the day during slow periods).
You would use a quartz job if you wanted to check for new jobs every few seconds/minutes/hours/etc, and process one/many of them at that specified time interval.
So, Quartz is a scheduler and a batch job is a process. You could use Quartz to schedule batch jobs to run at specific times. They aren't competing technologies, they are complementary.
About your question:
Given that you talk about queues and their persistence, however, it sounds a lot like your problem would fit a simple JMS model. You would need some messaging software. To make it easy on yourself, I'd recommend spring-jms as a wrapper around the basic Java EE JMS API; the Spring wrappers are simpler than raw JMS. For a messaging service I'd look at RabbitMQ, because again it's pretty simple (note that RabbitMQ speaks AMQP natively, so with it you would typically use spring-amqp or RabbitMQ's JMS client rather than plain JMS).
With the JMS architecture you'd post user requests to the queue, which you'd configure to be persistent. You'd have a custom listener on the queue, passing requests to a report generator whenever it runs. You can assign one or more threads to the listener, meaning that you should find it easy to tune the performance of the report generator.
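To make the queue-plus-listener idea concrete, here is a minimal sketch that uses only the JDK. It is not JMS: an in-memory `LinkedBlockingQueue` stands in for the persistent queue, so nothing here survives a restart. The class name `ReportQueueSketch`, the `ReportRequest` type and the fake `generate` step are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory stand-in for a persistent queue and its listener threads.
final class ReportQueueSketch {
    // Hypothetical request type: which report to build and for which date.
    static final class ReportRequest {
        final String reportName;
        final String reportDate;

        ReportRequest(String reportName, String reportDate) {
            this.reportName = reportName;
            this.reportDate = reportDate;
        }
    }

    // Marker object that tells a listener thread to stop.
    private static final ReportRequest STOP = new ReportRequest("__stop__", "");

    private final BlockingQueue<ReportRequest> queue = new LinkedBlockingQueue<>();
    private final List<String> generated = new CopyOnWriteArrayList<>();
    private final ExecutorService listeners;
    private final int listenerCount;

    ReportQueueSketch(int listenerCount) {
        this.listenerCount = listenerCount;
        this.listeners = Executors.newFixedThreadPool(listenerCount);
        for (int i = 0; i < listenerCount; i++) {
            listeners.execute(this::listen);
        }
    }

    // Called by the web layer when a user schedules a report.
    void post(ReportRequest request) {
        queue.add(request);
    }

    // Number of requests still waiting, e.g. for an admin screen.
    int pending() {
        return queue.size();
    }

    // Listener loop: take one request at a time and hand it to the generator.
    private void listen() {
        try {
            while (true) {
                ReportRequest request = queue.take();
                if (request == STOP) {
                    return;
                }
                generated.add(generate(request));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Placeholder for the real Excel/PDF generation; returns a fake file name.
    private String generate(ReportRequest request) {
        return request.reportName + "-" + request.reportDate + ".pdf";
    }

    // Lets the listeners finish the queued requests, then returns the generated file names.
    List<String> shutdownAndCollect() {
        for (int i = 0; i < listenerCount; i++) {
            queue.add(STOP);
        }
        listeners.shutdown();
        try {
            listeners.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(generated);
    }
}
```

Changing the constructor argument is the "tuning" knob mentioned above: more listener threads means more reports generated in parallel.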
There is a pretty useful DZone article about using rabbitmq via spring-integration (a set of prebuilt pattern implementations that help with connecting things to each other).

Related

Using Spring or Lambda for bulk event trigger

Looking for some help on an application design. I am using spring framework and hosting application in AWS.
I am working on an enterprise Java web application that is supposed to handle events when their trigger time is reached. For example, consumers can set an event to begin on 12/20/22 at 07:35 AM, and the system is supposed to send a notification when that time is reached.
I can store these events in a database along with their trigger time and set up a Spring scheduler (@Scheduled) to run every minute and process events whose trigger time has been reached. My only concern with this approach is that there could be hundreds or thousands of events to trigger in any given minute, and they might not all be processed within one minute.
Is there an alternative way to design this? I don't know if Spring offers a feature where I could create these events and have the framework trigger them when their trigger time is reached. That way, I could stay away from managing the scheduling and triggering part.
I am using AWS to host this application, so another option I'm considering is creating an AWS Lambda for every such event and letting AWS manage the triggering part. That way, I could stay away from managing the triggers.
What are your views? Have you come across similar problems, and how did you resolve them?
You can consider using spring-cloud-dataflow to manage this as tasks and streams.
You create a custom batch application that uses @Scheduled to check your database for events that are due and then sends those events to a stream. You can use Spring Integration APIs to interact with RabbitMQ or Kafka topics.
The event should contain enough information to process it.
You then have a stream application that produces the content and sends it via email, or passes it on to a separate stream app that sends the email.
https://dataflow.spring.io/docs/stream-developer-guides/programming-models/
The flow will look something like:
:mail_events | message-processor | message-sender
You will configure the property for mail_events to match the topic created and configured for your mail-event-batch application.
You can use Spring Cloud Data Flow to manage the mail-event-batch application as well.
You can scale each application https://dataflow.spring.io/docs/recipes/scaling/
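The polling step in the batch application could look roughly like the sketch below. It is plain Java with no Spring or messaging dependency: a `List` stands in for the database table, and the returned list is what you would publish to the topic. The names `DueEventPoller`, `Event` and `collectDue` are hypothetical; in the real application this logic would sit behind a database query called from the @Scheduled method.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "find the events that are due and hand them on".
final class DueEventPoller {
    // Hypothetical event row: an id, its trigger time and a dispatched flag.
    static final class Event {
        final String id;
        final Instant triggerTime;
        boolean dispatched;

        Event(String id, Instant triggerTime) {
            this.id = id;
            this.triggerTime = triggerTime;
        }
    }

    // Returns the events whose trigger time is at or before 'now' and that have not
    // been dispatched yet, marking each one as dispatched so it is not picked up twice.
    static List<Event> collectDue(List<Event> stored, Instant now) {
        List<Event> due = new ArrayList<>();
        for (Event event : stored) {
            if (!event.dispatched && !event.triggerTime.isAfter(now)) {
                event.dispatched = true;
                due.add(event);
            }
        }
        return due;
    }
}
```

Because the poller only selects and forwards events, the slow part (building and sending the notification) runs in the downstream stream apps, which is what lets a one-minute polling interval cope with thousands of due events.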

Scheduling task at some specific time in Java

I have some code that will schedule many jobs at different date-times, so overall I will have a lot of jobs to run at specific date-times. I know that there is Spring Scheduler, which executes a job at some fixed period, but it does not schedule jobs dynamically. I could use ActiveMQ with timed delivery or Quartz for my purpose, but I am looking for a suggestion: should I use Quartz, ActiveMQ timed/delayed delivery, or something else?
Another alternative is an executor service with timed execution, but I believe the jobs would be lost if the application restarts. Any help will be appreciated.
While you can schedule message delivery in ActiveMQ it wasn't designed to be used as a job scheduler whereas that's exactly what Quartz was designed for.
In one of your comments you talked about wanting a "scalable solution" and ActiveMQ won't scale well with a huge number of scheduled jobs because the more messages which accumulate in the queues the worse it will perform since it will ultimately have to page those messages to disk rather than keeping them in memory. ActiveMQ, like most message brokers, was meant to hold messages for a relatively short amount of time before they are consumed. It's much different than a database which is better suited for this use-case. Quartz should scale better than ActiveMQ for a large number of jobs for this reason.
Also, the complexity of the jobs you can configure in Quartz is greater. If you go with ActiveMQ and you eventually need more functionality than it supports then that complexity will be pushed down into your application code. However, there's a fair chance you could simply do what you want with Quartz since it was designed as a job scheduler.
Lastly, a database is more straight-forward to maintain than a message broker in my opinion and a database is also easy to provision in most cloud providers. I'd recommend you go with Quartz.
You can start by using a cron expression to cover the case where your application restarts. The cron expression can be stored in the properties file. Also, once a job has been scheduled, you can restart or reschedule it programmatically, for example by creating a new job instance with a different cron expression.
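For comparison, this is roughly what the plain executor-service alternative from the question looks like. It is a JDK-only sketch, not Quartz, and the names `OneShotScheduling`, `delayMillis` and `scheduleAt` are made up. It shows why that option is not restart-safe: the scheduled task lives only in the executor's memory, whereas Quartz can keep jobs in a database-backed job store.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: run a job once at a given date-time using only the JDK.
final class OneShotScheduling {
    // Milliseconds to wait from 'now' until 'runAt'; zero if 'runAt' is already in the past.
    static long delayMillis(LocalDateTime now, LocalDateTime runAt) {
        long millis = Duration.between(now, runAt).toMillis();
        return Math.max(0L, millis);
    }

    // Schedules the job in memory only. If the JVM stops before 'runAt', the job is gone,
    // so a real application would have to reload pending jobs from storage on startup.
    static ScheduledFuture<?> scheduleAt(ScheduledExecutorService scheduler,
                                         LocalDateTime runAt,
                                         Runnable job) {
        long delay = delayMillis(LocalDateTime.now(), runAt);
        return scheduler.schedule(job, delay, TimeUnit.MILLISECONDS);
    }
}
```

If you went this way you would end up writing the persistence and reload logic yourself, which is the part a job scheduler with a persistent store already provides.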

Spring Batch Flow Job - control number of jobs running simultaneously

I have a complex long-running flow that I'm going to implement based on Spring Batch Flow Job.
My REST API will wait for the incoming request and then (based on each request) initiate a new job execution.
Right now I'm worried about server resources, because the number of incoming requests is quite large and I'd like to control the number of jobs running simultaneously. Is there any way to tell Spring Batch to run no more than a fixed number of jobs at once (let's say 5) and put the rest into a queue, to be executed later when, for example, one of the previous 5 jobs has finished?
There is not a way to accomplish this in Spring Batch. The reason for this is that the number of concurrent jobs is really an orchestration problem which Spring Batch specifically avoids solving (allowing you to integrate with whatever you want).
That being said, what you're describing can be done in a relatively straightforward manner by implementing a work queue that stores the requests to run a job, and having a service pick up those requests at the other end. The concurrency can be controlled easily with Spring Integration components to prevent the system from being overloaded (assuming you have a mechanism to handle the queue size in question).
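As a rough illustration of the work-queue idea without Spring Integration, a fixed-size thread pool from the JDK already gives "at most N at once, the rest wait in a queue". This is only a sketch under that substitution. `BoundedJobLauncher` is a hypothetical name, the `Runnable` stands in for whatever actually launches the Spring Batch job, and the counters exist only to make the limit observable.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: runs at most maxConcurrentJobs at a time and queues the rest.
final class BoundedJobLauncher {
    private final ExecutorService workers;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();
    private final AtomicInteger completed = new AtomicInteger();

    BoundedJobLauncher(int maxConcurrentJobs) {
        // A fixed pool runs up to maxConcurrentJobs tasks; extra tasks wait in its
        // internal (unbounded, in-memory) queue until a worker thread is free.
        this.workers = Executors.newFixedThreadPool(maxConcurrentJobs);
    }

    // Called per incoming REST request; returns immediately.
    void submit(Runnable job) {
        workers.execute(() -> {
            int now = running.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            try {
                job.run();
            } finally {
                running.decrementAndGet();
                completed.incrementAndGet();
            }
        });
    }

    // Stops accepting new jobs and waits for the queued ones to finish.
    boolean awaitAll(long timeoutSeconds) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int peakConcurrency() {
        return peak.get();
    }

    int completedJobs() {
        return completed.get();
    }
}
```

The pool's queue is in memory, so queued requests are lost on restart; storing the requests in a durable queue or table, as suggested above, addresses that.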

Sending scheduled emails in a spring application?

I need to achieve the following : -
Sending emails to around 6000 users around 30 times in a year. Sometimes sending emails at specific time of day else at midnight.
I need to provide retry functionality in my application, so if for some reason my application fails to send an email to some of the users, it should retry up to 3 times (over up to 3 days) before finally marking it as a failure.
I need to send emails using predefined templates with dynamic data in them.
My application tech stack - java, spring boot 1.4, oracle database, CA autosys job scheduler, activiti bpm (not using Activiti as of now but can use it if it is the best solution)
My current solution :-
Use autosys scheduler to define these jobs.
The jobs call my REST services (Spring + Java + Oracle tech stack), which perform all the application logic and then use Apache Commons Email to send the email through my SMTP server.
My question: what is the recommended way to send email in this case, given that I have to maintain various tables to achieve the retry functionality? Should I use Activiti instead of the Autosys scheduler? Or the Spring framework itself for this email scheduling?
I don't see any business processes to be managed in your problem. As long as no business people are involved in any task (such as filling in a form or making a decision based on the input provided), you should avoid Activiti. Activiti is a BPM engine; there is no use for it unless you are managing a process. For scheduling you should definitely go ahead with the Spring framework. Do let me know if I've missed any point.
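Whichever scheduler triggers the run, the retry bookkeeping itself is small. Here is a sketch of the per-recipient state you would keep in a table, assuming "retry 3 times" means three send attempts in total, one per scheduled run. The class `EmailDelivery` and its fields are hypothetical, and the actual sending (templates, SMTP) is left out.

```java
// Hypothetical per-recipient delivery record, e.g. one row in a retry table.
final class EmailDelivery {
    enum Status { PENDING, SENT, FAILED }

    // Assumption: three attempts in total, one per scheduled run.
    static final int MAX_ATTEMPTS = 3;

    final String recipient;
    int attempts;
    Status status = Status.PENDING;

    EmailDelivery(String recipient) {
        this.recipient = recipient;
    }

    // Records the outcome of one send attempt. A delivery stays PENDING (and is picked
    // up again by the next run) until it succeeds or the attempts are used up.
    void recordAttempt(boolean delivered) {
        if (status != Status.PENDING) {
            return; // already finished; ignore further attempts
        }
        attempts++;
        if (delivered) {
            status = Status.SENT;
        } else if (attempts >= MAX_ATTEMPTS) {
            status = Status.FAILED;
        }
    }
}
```

Each scheduled run would then select the PENDING rows, try to send, and call `recordAttempt` with the result.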

Spring Batch or JMS for long running jobs

I have to run very long-running processes on my web service, and I'm looking for a good way to handle the result. The scenario: a user starts such a long-running process via the UI. He then gets a message that his request was accepted and that he should come back some time later, so there's no need to show him the status of his request or anything like that. I'm just looking for a way to handle the result of the long-running process properly. Since the processes are external programs, my application server is not aware of them. Therefore I have to wait for these programs to terminate. Of course I don't want to use EJBs for this, because they would block for as long as no result is available. Instead I thought of using JMS or Spring Batch. Has anyone had the same problem, or any advice on which solution would be better?
It really depends on what forms of communication your external programs have available. JMS is a very good approach and immediately available in your app server but might not be the best option if your external program is a long running DB query which dumps the result in a text file...
The main advantage of Spring Batch over "just" using JMS as an asynchronous communications channel is its transactional properties, which allow the infrastructure to retry failed jobs, group jobs together and so on. Without knowing more about your specific setup, it is hard to give detailed advice.
Cheers,
I had a similar design requirement, users were sending XML files and I had to generate documents from them. Using JMS in this case is advantageous since you can always add new instances of these processes which can consume and execute the jobs in parallel.
You can use a timer task to check status or monitor these processes. Also, you can publish a message to a JMS queue once the processes are completed.
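A sketch of the "publish a message once the process completes" part, using only the JDK. It swaps the timer-based polling for a completion callback on a `CompletableFuture`, and an in-memory `BlockingQueue` stands in for the JMS queue. The names `CompletionNotifier` and `notifyOnExit` and the message format are made up. For a real `java.lang.Process`, `Process.onExit()` (Java 9+) provides a comparable future.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper: turn "external program finished" into a message on a queue.
final class CompletionNotifier {
    // When the exit-code future completes, puts "<jobId>:OK" or "<jobId>:FAILED" on the
    // queue. Nothing blocks while the external program is still running.
    static CompletableFuture<Void> notifyOnExit(String jobId,
                                                CompletableFuture<Integer> exitCode,
                                                BlockingQueue<String> completions) {
        return exitCode.handle((code, error) -> {
            boolean success = error == null && code != null && code == 0;
            completions.add(jobId + ":" + (success ? "OK" : "FAILED"));
            return null;
        });
    }
}
```

A consumer on the other end of the queue can then store the result or mark the user's request as done.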
