spring batch vs quartz jobs? - spring

I am new to batch processing. I am trying to start with a simple scheduler and job, but I am confused about
Spring Batch vs. Quartz jobs. My understanding is:
Quartz: provides both frameworks, i.e. a scheduler framework and a job framework (in case I do not want to use Spring Batch jobs). Right?
Spring Batch: only provides the job framework. I have always seen the Quartz scheduler used to schedule Spring Batch jobs.
Does Spring provide its own scheduler as well?

Quartz is a scheduling framework. It handles rules like "execute something every hour" or "every last Friday of the month".
Spring Batch is a framework that defines that "something" that will be executed.
You can define a job that consists of steps. Usually, a step consists of an item reader, an optional item processor, and an item writer, but you can also define a custom step. You can tell Spring Batch to commit every 10 items, among many other options.
You can use Quartz to start Spring Batch jobs.
So basically Spring Batch defines what should be done, and Quartz defines when it should be done.
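To make the reader / processor / writer idea and the "commit every 10 items" remark more concrete, here is a minimal plain-Java sketch of chunk-oriented processing. It deliberately does not use the Spring Batch API: `ChunkStepSketch` and its `run` method are hypothetical names invented for this illustration, and the "commit" is simply a call to the writer with one chunk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical stand-in for a chunk-oriented step; this is NOT the Spring Batch API.
final class ChunkStepSketch {

    // Reads items one at a time, processes each one, and hands the results to the
    // writer in chunks of at most chunkSize. Returns the number of chunks written.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        int chunks = 0;
        List<O> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            O processed = processor.apply(reader.next());
            if (processed != null) {                    // a null result filters the item out
                buffer.add(processed);
            }
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // "commit" one full chunk
                buffer.clear();
                chunks++;
            }
        }
        if (!buffer.isEmpty()) {                        // flush the final partial chunk
            writer.accept(new ArrayList<>(buffer));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            input.add(i);
        }
        List<List<String>> written = new ArrayList<>();
        int chunks = ChunkStepSketch.<Integer, String>run(
                input.iterator(), i -> "item-" + i, written::add, 10);
        System.out.println(chunks + " chunks written");
    }
}
```

With 25 input items and a chunk size of 10, the writer is called three times (10, 10 and 5 items). The real framework adds transactions, restartability and skip/retry handling on top of this basic loop.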

There is an answer to this question in the official FAQ:
How does Spring Batch differ from Quartz?
Is there a place for them both in a solution?
Spring Batch and Quartz have different goals. Spring Batch provides functionality for processing large volumes of data and Quartz provides functionality for scheduling tasks. So Quartz could complement Spring Batch, but are not excluding technologies. A common combination would be to use Quartz as a trigger for a Spring Batch job using a Cron expression and the Spring Core convenience SchedulerFactoryBean.

Does Spring provide its own scheduler as well?
Yes, using Spring TaskScheduler as follows:
<task:scheduled-tasks>
<task:scheduled ref="runScheduler" method="run" fixed-delay="5000" />
</task:scheduled-tasks>
<task:scheduled-tasks>
<task:scheduled ref="runScheduler" method="run" cron="*/5 * * * * *" />
</task:scheduled-tasks>
With Quartz Scheduler as follows:
<!-- run every 10 seconds -->
<bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="triggers">
<bean id="cronTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail" ref="jobDetail" />
<property name="cronExpression" value="*/10 * * * * ?" />
</bean>
</property>
</bean>

Spring Batch reads data from a data source (a table in a database, a flat file, etc.), processes that data, and then stores it in another data source, possibly in another format.
I have written a tutorial on my blog on how to integrate Spring Boot 2, Spring Batch and Quartz.
You can also integrate Spring Boot and Spring Batch and skip the Quartz integration.
Quartz is a scheduler that schedules a task in the future and it has its own metadata tables to manage the state of the jobs.

Related

Run batch only on one server

I have a Spring Boot MVC and batch application. The batch and MVC parts share the DAO and service layers, so they are in the same war file. They are deployed to 4 cloud servers, and there is a load balancer and a VIP configured for the UI application. So the MVC application is fine.
The problem is that, as part of the batch, I FTP a file to an external server, and that external server FTPs the processed file back. The processed file comes back to only one of the 4 servers, so I want the batch to run on only 1 server. How do I suppress the batch from executing on the other servers?
The solution becomes easier because your 4 instances are running on 4 different cloud servers. The starting point of the batch can be a file poller. If the file is dropped into the polled directory on server 1, the batch job on server 1 will be invoked. The other instances do nothing, as no file is dropped on those servers.
You need to put a file poller in front of Spring Batch, something like this: http://docs.spring.io/spring-batch/reference/html/springBatchIntegration.html
<int:channel id="inboundFileChannel"/>
<int:channel id="outboundJobRequestChannel"/>
<int:channel id="jobLaunchReplyChannel"/>
<int-file:inbound-channel-adapter id="filePoller"
channel="inboundFileChannel"
directory="file:/tmp/myfiles/"
filename-pattern="*.csv">
<int:poller fixed-rate="1000"/>
</int-file:inbound-channel-adapter>
<int:transformer input-channel="inboundFileChannel"
output-channel="outboundJobRequestChannel">
<bean class="io.spring.sbi.FileMessageToJobRequest">
<property name="job" ref="personJob"/>
<property name="fileParameterName" value="input.file.name"/>
</bean>
</int:transformer>
<batch-int:job-launching-gateway request-channel="outboundJobRequestChannel"
reply-channel="jobLaunchReplyChannel"/>
This is only one of many possible approaches, but one way to achieve it is to keep a value in a property file and set it to the Boolean true on the server that should run the batch.
Then have your batch run only if that property value is true.
This gives you the flexibility to change which server handles the batch job.
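The property-flag approach above could be sketched roughly as follows in plain Java. The property key `batch.enabled`, the class name `BatchSwitchSketch` and its methods are all hypothetical; in a real Spring Boot application the value would more likely come from the environment or a per-server properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: gate a batch run on a per-server boolean property.
final class BatchSwitchSketch {

    // Assumed property key; not taken from the original question.
    static final String KEY = "batch.enabled";

    // Parses properties from text so the example needs no file on disk.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // A missing or non-"true" value means the batch stays off on this server.
    static boolean isBatchEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "false"));
    }

    // Runs the job only when the flag is set; reports what happened.
    static String runIfEnabled(Properties props, Runnable job) {
        if (!isBatchEnabled(props)) {
            return "skipped";
        }
        job.run();
        return "ran";
    }

    public static void main(String[] args) {
        Properties server1 = parse("batch.enabled=true");
        Properties server2 = parse("batch.enabled=false");
        System.out.println("server1: " + runIfEnabled(server1, () -> { }));
        System.out.println("server2: " + runIfEnabled(server2, () -> { }));
    }
}
```

Defaulting to "false" when the key is absent means a newly added server does not start running the batch by accident.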

Spring+quartz should I make my own bean?

I need a job that runs every minute between 17h and 18h; it should not be relaunched if the previous run is unfinished.
The org.springframework.scheduling.quartz.CronTriggerBean seems to be what I need, but I found nothing about concurrency.
Do you know of a Quartz bean that would fit my needs?
Every javadoc I found has almost all of its links broken.
http://docs.spring.io/spring/docs/3.1.x/javadoc-api/org/springframework/scheduling/quartz/CronTriggerBean.html
Or will I have to make my own kind of bean?
Quartz is at version 1.8.5 and Spring at 2.5.6.
Thanks.
-Sure, the CronTriggerBean is suitable for your case. The expression you need is 0 * 17 * * ? which fires every minute starting at 17:00, with the last trigger happening at 17:59.
-To disable concurrency in newer versions, you can put @DisallowConcurrentExecution on your job class. In version 1.8 I think that annotation is not supported; instead, your job class needs to implement StatefulJob, which marks it as a job that can be run by only one thread at a time.
-A sample app using Quartz 1.8 can be found at http://www.mkyong.com/spring/spring-quartz-scheduler-example/
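As a rough illustration of what "not relaunched if the job is unfinished" means, here is a small plain-Java guard. It is not Quartz code: `NonConcurrentGuardSketch` is a hypothetical name, and it skips an overlapping invocation outright, which is an assumption made for the sketch. Quartz with StatefulJob or @DisallowConcurrentExecution instead holds the trigger back until the running execution has finished.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard that refuses to start a job while a previous run is in progress.
final class NonConcurrentGuardSketch {

    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable job;

    NonConcurrentGuardSketch(Runnable job) {
        this.job = job;
    }

    // Returns true if the job ran, false if a previous run was still in progress.
    boolean tryRun() {
        if (!running.compareAndSet(false, true)) {
            return false;                 // someone else holds the flag: skip this firing
        }
        try {
            job.run();
            return true;
        } finally {
            running.set(false);           // always release, even if the job throws
        }
    }

    public static void main(String[] args) {
        NonConcurrentGuardSketch[] holder = new NonConcurrentGuardSketch[1];
        // The job itself simulates a trigger firing while the job is still running.
        holder[0] = new NonConcurrentGuardSketch(() ->
                System.out.println("overlapping attempt ran: " + holder[0].tryRun()));
        System.out.println("first attempt ran: " + holder[0].tryRun());
    }
}
```

The overlapping attempt made from inside the running job is rejected, while the outer attempt completes normally and releases the guard for the next firing.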
The 2.5 JavaDoc can be found here.
In Spring 2.5 you can set a concurrent attribute in the XML when using MethodInvokingJobDetailFactoryBean. Setting it to false prevents multiple instances from running at the same time (the default is true), but note that triggers will be queued up and launched when the previous instance of the job finishes.
Here is a sample:
<bean id="fooJob" class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
<property name="targetObject" ref="fooManager" />
<property name="targetMethod" value="myJOb" />
<!-- false = do not run this job concurrently -->
<property name="concurrent" value="false"/>
</bean>

Spring scheduler tasks causes memory leak in tomcat

I have used the Spring scheduler to run methods on a cron timer, as shown below. The application has at least 50 scheduler beans of the same class as the taskSchedulerClass bean below. We create new beans by passing configuration parameters through an XML file named in the property section. However, we get the error below from Tomcat 6.0.36. Is this an issue, and is there any way to overcome it? If we add a lot of scheduled tasks like the one below, won't this affect application performance?
SEVERE: The web application [/App ] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@757fad]) and a value of type [org.mozilla.javascript.Context[]] (value [[Lorg.mozilla.javascript.Context;@18e915a]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
<task:scheduled-tasks scheduler="myScheduler">
<task:scheduled ref="taskSchedulerClass" method="callScheduler" cron="0 0/4 * * * *"/>
</task:scheduled-tasks>
<task:scheduler id="myScheduler" pool-size="10"/>
<bean id="taskSchedulerClass" class="com.abc.efg.util.xyz">
<property name="xmlName" value="xyz.xml" />
</bean>
Rhino's context cleanup was improved only in Tomcat 7: https://issues.apache.org/bugzilla/show_bug.cgi?id=49159 . So you will still get this message on Tomcat 6. The error does not seem to be related to your scheduler.
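For background on why Tomcat prints this warning at all: a value stored in a ThreadLocal stays attached to its thread until it is removed, and pooled threads outlive the task that set it. The plain-Java sketch below shows that effect and the usual remove-in-finally remedy. It is not Rhino or Tomcat code; `ThreadLocalCleanupSketch` and `CONTEXT` are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demonstration of a ThreadLocal value outliving its task on a pooled thread.
final class ThreadLocalCleanupSketch {

    // Stand-in for a per-thread context such as the one named in the Tomcat warning.
    static final ThreadLocal<byte[]> CONTEXT = new ThreadLocal<>();

    // Runs a task on a single pooled thread, then checks from a second task on the
    // same thread whether the value set by the first task is still there.
    static boolean valueSurvives(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> {
                CONTEXT.set(new byte[1024]);
                try {
                    CONTEXT.get();            // placeholder for the real work
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove();     // detach the value from the pooled thread
                    }
                }
            }).get();
            return pool.submit(() -> CONTEXT.get() != null).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove(), value survives: " + valueSurvives(false));
        System.out.println("with remove(), value survives: " + valueSurvives(true));
    }
}
```

When the library that owns the ThreadLocal does not clean up after itself, as reported for Rhino on Tomcat 6, application code usually cannot fix this directly, which is consistent with the answer's point that the scheduler configuration is not the cause.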

quartz jboss spring multiple webapps

I have been using Quartz successfully in my application.
Basically, I have Quartz bundled inside webapp1, which is running inside JBoss.
But we now have another application, webapp2, running in the same JBoss, which needs Quartz jobs as well.
What I need is a Quartz scheduler running in JBoss as some kind of service, so that both webapps are able
to register their jobs on that single Quartz scheduler.
Below is the relevant Spring configuration for webapp1, which has been working until now.
<bean id="qtzScheduler"
class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="dataSource">
<ref bean="jndiDataSource" />
</property>
<property name="applicationContextSchedulerContextKey">
<value>applicationContext</value>
</property>
<property name="transactionManager">
<ref bean="transactionManager" />
</property>
<property name="schedulerName" value="webapp1" />
</bean>
<bean id="wrapperScheduler" class="uk.fa.quartz.schedule.ServiceScheduler">
<property name="scheduler">
<ref bean="qtzScheduler" />
</property>
</bean>
<bean id="jndiDataSource" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiName">
<value>java:/FmManagerDS</value>
</property>
</bean>
When I have to schedule a job, the code looks like this:
WrapperScheduler scheduler = (WrapperScheduler) ctx.getBean("wrapperScheduler");
scheduler.scheduleCronJob(job, jobName + "CronTrigger", WrapperScheduler.TRIGGER_GROUP, cronExpression);
Now I don't want to define the same scheduler again in webapp2, which would result in 2 Quartz schedulers running in JBoss.
Does anyone have an idea how to do this?
I saw an example on the internet, at the link below, which I think does what I want.
But I don't understand how I can integrate it with my system using the data source defined in my Spring configuration.
If anybody can share the configuration or point me to the right resource on the internet, I would be very thankful.
The link you refer to explains how to access the Quartz scheduler services built into JBoss. I have never used such an approach, but basically you let JBoss handle your scheduler, data source and everything around it. This makes it very easy to take advantage of job scheduling without all the configuration hassle, but it is not very flexible and your application is no longer self-contained.
In your case I see two options worth investigating:
Clustered Quartz scheduler
Configure both of your web applications to run in a cluster. Both applications will share the same database and will run jobs defined in each other. This might not be an option for you due to several reasons:
both applications must be able to run jobs defined by each other - e.g. job classes must be available on CLASSPATH
you still need to define Quartz configuration in both applications (you can easily share configuration though, e.g. by extracting the XML configuration into a separate file)
both applications will maintain separate thread pool
Clustering is more suited for homogeneous applications running on several machines, not heterogeneous ones on single node.
Remote scheduler
Quartz has built-in support for remote schedulers via RMI. Basically, one application hosts a full-blown Quartz server while the other connects to that server. This seems like a better approach for you (let's call it "master-slave"), as only one application manages the scheduler while the other uses the existing one.
See: RemoteScheduler.
I finally got time to write it all down for others who might have to use Quartz as a service running in JBoss.
The other options mentioned by @Tomasz in his answer can also be tried.
Please note that you will get a null reference back if you try to retrieve it from outside the JBoss server in which it was bound. In case you have such a requirement you might want to consider using Quartz's RMI support instead.
1) First of all, ensure that you remove any existing versions of Quartz in jboss/[profile]/lib, or the quartz.rar which comes with the JBoss distribution.
2) Place your quartz.1.8.3.jar and quartz-jboss.1.8.jar into acccesmanager/[profile]/lib
3) Below is the quartz-service.xml, which needs to be placed in the JBoss deploy folder and which starts the Quartz scheduler:
<server>
<mbean code="org.quartz.ee.jmx.jboss.QuartzService"
name="user:service=QuartzService,name=QuartzService">
<attribute name="JndiName">Quartz</attribute>
<attribute name="Properties">
org.quartz.scheduler.instanceName = BGSScheduler
org.quartz.scheduler.rmi.export = false
org.quartz.scheduler.rmi.proxy = false
org.quartz.scheduler.xaTransacted = false
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 5
org.quartz.threadPool.threadPriority = 4
org.quartz.scheduler.threadsInheritContextClassLoaderOfInitializer = true
org.quartz.threadPool.threadsInheritContextClassLoaderOfInitializingThread = true
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreCMT
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = QUARTZ
org.quartz.dataSource.QUARTZ.jndiURL = java:FmManagerDS
org.quartz.jobStore.nonManagedTXDataSource = QUARTZ_NO_TX
org.quartz.dataSource.QUARTZ_NO_TX.jndiURL = java:FmManagerDS
</attribute>
<depends>jboss.jca:service=DataSourceBinding,name=FmManagerDS</depends>
</mbean>
</server>
Most of these settings are self-explanatory; you can find more details in the Quartz Configuration reference.
The key thing to note is that Quartz requires 2 data sources. One is the container-managed data source, the same one that is defined in the JBoss *-ds.xml (java:FmManagerDS in my case).
If your 'org.quartz.jobStore.dataSource' is XA, then set 'org.quartz.jobStore.nonManagedTXDataSource' to a non-XA datasource (for the same DB). Otherwise, you can set them to be the same.
Then, in the Spring applicationContext, I had to get a handle on the Quartz scheduler so that I could inject it into the wrapperScheduler. The code for that is below:
<bean id="quartzScheduler" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiName">
<value>Quartz</value>
</property>
</bean>
<bean id="wrapperScheduler" class="uk.fa.quartz.schedule.ServiceScheduler">
<property name="scheduler">
<ref bean="quartzScheduler" />
</property>
</bean>
Then we can schedule jobs as follows:
Timestamp t = new Timestamp (System.currentTimeMillis());
ScheduleJob job = new ScheduleJob(EmailJob.class.getCanonicalName() +t.toString(), EmailJob.class);
Below is the code that passes the Spring applicationContext to the EmailJob so that it can access beans and other resources.
To achieve that, we implement the ApplicationContextAware interface so that the applicationContext is available, and it is then
pushed into the SchedulerContext. Make sure you do not put the applicationContext into the JobDataMap if you are using the JDBC job store, as that causes serialization issues.
serviceScheduler.getScheduler().getContext().put("applicationContext", ctx);
serviceScheduler.scheduleCronJob(job, "test" + t.toString(), ServiceScheduler.DEFAULT_TRIGGER_GROUP, cronExpression);
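The serialization warning above can be reproduced without Quartz. The sketch below assumes that a JDBC-backed job store persists job data using standard Java serialization, and it uses a hypothetical `FakeContext` class as a stand-in for a non-serializable object such as an application context. `JobDataSerializationSketch` is likewise an invented name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical check: can a map of job data be written with Java serialization?
final class JobDataSerializationSketch {

    // Stand-in for a non-serializable object; it does not implement Serializable.
    static final class FakeContext { }

    static boolean canSerialize(Map<String, Object> jobData) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(jobData);
            return true;
        } catch (NotSerializableException e) {
            return false;                 // at least one value cannot be persisted
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> plain = new HashMap<>();
        plain.put("recipient", "someone@example.com");

        Map<String, Object> withContext = new HashMap<>(plain);
        withContext.put("applicationContext", new FakeContext());

        System.out.println("plain values serializable: " + canSerialize(plain));
        System.out.println("with context serializable: " + canSerialize(withContext));
    }
}
```

Keeping the context in the scheduler-level context, which lives in memory only, avoids pushing it through the persistence path.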
Others who are not using the wrapper scheduler can similarly get a handle on the Quartz scheduler directly in their code:
InitialContext ctx = new InitialContext();
Scheduler scheduler = (Scheduler) ctx.lookup("Quartz");
ScheduleJob job = new ScheduleJob(EmailJob.class.getCanonicalName() +t.toString(), Executor.class);
scheduler.scheduleJob(job, trigger);
In the EmailJob class you can use the following to get the applicationContext:
applicationContext = (ApplicationContext) context.getScheduler().getContext().get(APPLICATION_CONTEXT_KEY);
//get spring bean and do the necessary stuff
Another important point: because the Quartz scheduler is running outside the web application, Quartz won't be able to fire the job class if it is inside the war. It needs to be in a shared jar in jboss/[profile]/lib.

Create a Service with Java that runs when the app is deployed?

I have a web application built with J2EE and Spring, backed by an Oracle 10g database. I want to create a service in Java that polls statistics from the database and sends mail every 5 minutes. This service should start when the application is deployed under Tomcat or WebSphere.
Any ideas how this could be done?
Thanks
Since you use Spring, its task execution and scheduling classes seem a natural choice. They work in both Tomcat and WebSphere; just create your task as a POJO and schedule it:
<bean id="PollingTask" class="com.sth.PollingPOJO">
<!-- properties, if any -->
</bean>
<task:scheduler id="scheduler" pool-size="1" />
<task:scheduled-tasks scheduler="scheduler">
<!-- runs every 30 minutes -->
<task:scheduled ref="PollingTask" method="run" fixed-delay="#{ 30*60*1000 }" />
</task:scheduled-tasks>
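The task class looks like this (note that it doesn't have to implement Runnable; the "run" method name is just a convention):
package com.sth;

// Matches class="com.sth.PollingPOJO" in the bean definition above
public class PollingPOJO {
public void run() {
// entry point
}
}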
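For readers who want to see the fixed-delay behaviour without a Spring context, the following plain-JDK sketch runs a task repeatedly with a fixed delay between the end of one run and the start of the next. It is only an analogue of the XML configuration above, not what Spring does internally; `PollingServiceSketch`, `start`, `stop` and `runFor` are hypothetical names, and the delays are shortened to milliseconds so the example finishes quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical plain-JDK polling service with fixed-delay semantics.
final class PollingServiceSketch {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Starts the task immediately, then waits `delay` after each run completes.
    void start(Runnable task, long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(task, 0, delay, unit);
    }

    // Should be called when the application is undeployed.
    void stop() {
        scheduler.shutdownNow();
    }

    // Helper for the demo: returns true if the task ran `runs` times before the timeout.
    static boolean runFor(int runs, long delayMillis, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(runs);
        PollingServiceSketch service = new PollingServiceSketch();
        service.start(latch::countDown, delayMillis, TimeUnit.MILLISECONDS);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            service.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran 3 times: " + runFor(3, 50, 5000));
    }
}
```

In a web application, `start` and `stop` would typically be tied to the container's startup and shutdown callbacks; with Spring, the task namespace shown above takes care of that lifecycle.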
