Spring quartz/cron jobs in a distributed environment - spring

I have a fleet of about 5 servers. I want to run an identical Spring/Tomcat app on each machine.
I also need a particular task to be executed every ten minutes. It should only run on one of the machines. I need some sort of election protocol or other similar solution.
Does Spring or Quartz have any sort of built-in distributed cron solution, or do I need to implement something myself?

Hazelcast has a distributed executor framework which you can use to run jobs using the JDK Executor framework (which, by the way, is possibly more testable than horrid Quartz... maybe). It has a number of modes of operation, including having it pick a single node "at random" to execute your job on.
See the documentation for more details

Related

Deploy 2 different topologies on a single Nimbus with 2 different hardware

I have 2 sets of storm topologies in use today, one is up 24/7, and does it's own work.
The other, is deployed on demand, and handles a much bigger loads of data.
As of today, we have N supervisors instances, all from the same type of hardware (CPU/RAM), I'd like my on demand topology to run on stronger hardware, but as far as I know, there's no way to control which supervisor is assigned to which topology.
So if I can't control it, it's possible that the 24/7 topology would assign one of the stronger workers to itself.
Any ideas, if there is such a way?
Thanks in advance
Yes, you can control which topologies go where. This is the job of the scheduler.
You very likely want either the isolation scheduler or the resource aware scheduler. See https://storm.apache.org/releases/2.0.0-SNAPSHOT/Storm-Scheduler.html and https://storm.apache.org/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.html.
The isolation scheduler lets you prevent Storm from running any other topologies on the machines you use to run the on demand topology. The resource aware scheduler would let you set the resource requirements for the on demand topology, and preferentially assign the strong machines to the on demand topology. See the priority section at https://storm.apache.org/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.html#Topology-Priorities-and-Per-User-Resource.

Apache Mesos Schedulers and Executors by example

I am trying to understand how the various components of Mesos work together, and found this excellent tutorial that contains the following architectural overview:
I have a few concerns about this that aren't made clear (either in the article or in the official Mesos docs):
Where are the Schedulers running? Are there "Scheduler nodes" where only the Schedulers should be running?
If I was writing my own Mesos framework, what Scheduler functionality would I need to implement? Is it just a binary yes/no or accept/reject for Offers sent by the Master? Any concrete examples?
If I was writing my own Mesos framework, what Executor functionality would I need to implement? Any concrete examples?
What's a concrete example of a Task that would be sent to an Executor?
Are Executors "pinned" (permanently installed on) Slaves, or do they float around in an "on demand" type fashion, being installed and executed dynamically/on-the-fly?
Great questions!
I believe it would be really helpful to have a look at a sample framework such as Rendler. This will probably answer most of your question and give you feeling for the framework internal.
Let me now try to answer the question which might be still be open after this.
Scheduler Location
Schedulers are not on on any special nodes, but keep in mind that schedulers can failover as well (as any part in a distributed system).
Scheduler functionality
Have a look at Rendler or at the framework development guide.
Executor functionality/Task
I believe Rendler is a good example to understand the Task/Executor relationship. Just start reading the README/description on the main github page.
Executor pinning
Executors are started on each node when the first Task requiring such executor is send to this node. After this it will remain on that node.
Hope this helped!
To add to js84's excellent response,
Scheduler Location: Many users like to launch the schedulers via another framework like Marathon to ensure that if the scheduler or its node dies, then it can be restarted elsewhere.
Scheduler functionality: After registering with Mesos, your scheduler will start getting resource offers in the resourceOffers() callback, in which your scheduler should launch (at least) one task on a subset (or all) of the resources being offered. You'll probably also want to implement the statusUpdate() callback to handle task completion/failure.
Note that you may not even need to implement your own scheduler if an existing framework like Marathon/Chronos/Aurora/Kubernetes could suffice.
Executor functionality: You usually don't need to create a custom executor if you just want to launch a linux process or docker container and know when it completes. You could just use the default mesos-executor (by specifying a CommandInfo directly in TaskInfo, instead of embedded inside an ExecutorInfo). If, however you want to build a custom executor, at minimum you need to implement launchTask(), and ideally also killTask().
Example Task: An example task could be a simple linux command like sleep 1000 or echo "Hello World", or a docker container (via ContainerInfo) like image : 'mysql'. Or, if you use a custom executor, then the executor defines what a task is and how to run it, so a task could instead be run as another thread in the executor's process, or just become an item in a queue in a single-threaded executor.
Executor pinning: The executor is distributed via CommandInfo URIs, just like any task binaries, so they do not need to be preinstalled on the nodes. Mesos will fetch and run it for you.
Schedulers: are some strategy to accept or reject the offer. Schedulers we can write our own or we can use some existing one like chronos. In scheduler we should evaluate the resources available and then either accept or reject.
Scheduler functionality: Example could be like suppose say u have a task which needs 8 cpus to run, but the offer from mesos may be 6 cpus which won't serve the need in this case u can reject.
Executor functionality : Executor handles state related information of your task. Set of APIs you need to implement like what is the status of assigned task in mesos slave. What is the num of cpus currently available in mesos slave where executor is running.
concrete example for executor : chronos
being installed and executed dynamically/on-the-fly : These are not possible, you need to pre configure the executors. However you can replicate the executors using autoscaling.

Reliable scheduling with sidekiq

I'm building a monitoring service similar to pingdom but monitoring different aspects of a system and using sidekiq to queue the tasks which is working well. What I need to do is to schedule sending out pings every minute, rather than using a cron based system which would require spinning up a new ruby instance every minute I have gone down the route of using sidetiq (notice the different spelling with a "t") which uses sidekiq's own queue to schedule future tasks. This feels like a neat solution, however I am concerned this may not be the most reliable way of scheduling tasks? If there are issues with the system (as there inevitable will be at some point) will this method of scheduling tasks be less reliable than using a cron based method and why?
Thanks
You are giving too short description of your system needs but I'll try to guess how it could be:
In the first place using sidekiq means that you'll also need an instance of redis and also means that you'll need a way to monitor the sidekiq process and restart it in case of failure and possibly redis server.
A method based on cron tasks will have fewer requirements therefore much less possibilities of failing.
cron has been around for a long time and it's battle tested and it's very very reliable, but has it's drawbacks too.
Said that, you can build a system with separate instances of redis in a master/slave configuration and you can also use Redis sentinel to implement a failover in case of the master failure, implement a monitoring/alerting system on this setup (you can use something super simple like this http://contribsys.com/inspeqtor/ from the sidekiq author) and you can also start several instances of sidekiq in different machines.
With all of that, you can have a quite reliable system for running sidekiq with sidetiq.
Hope it helps

Scheduled tasks with multiple servers - single point of responsibility

We have a Spring + JPA web application.
We use two tomcat servers that run both application and uses the same DB.
One of our application requirmemnt is to preform cron \ scheduled tasks.
After a short research we found that spring framework delivers a very straight forward solution to cron jobs,
(Annotation based solution)
However since both tomcats running the same webapp - if we will use this spring's solution we will create a very problematic scenario where 2 crons are running at the same time (each on a different tomcat)
Is there any way to solve this issue? maybe this alternative is not good for our purpose?
thanks!
As a general rule, you're going to want to save a setting to indicate that a job is running. Similar to how "Spring Batch" does the trick, you might want to create a table in your database simply for storing a job execution. You can choose to implement this however you'd like, but ultimately, your scheduled tasks should check the database to see if an identical task is already running, and if not, proceed with the task execution. Once the task has completed, update the database appropriately so that a future execution will be able to proceed.
#kungfuters solution is certainly a better end goal, but as a simple first implementation, you could use a property to enable/disable the tasks, and only have the tasks run on one of the servers.

Spring Batch Parallel Job Scaling

I'm currently working on a Spring Batch POC and have got a pretty good handle on most of the actual Spring Batch features. I've currently got a program that uses Spring Integration to receive an HttpRequest and use message channels to eventually send the job executions to the job launcher in a queue. What we'd really like to do is implement some kind of "scheduler/load balancer" (not quite sure what to call it) before the job launcher that will look at the currently running worker nodes and the size of the input file and make a decision on how many worker nodes the job should be allowed. We would probably also want to be able to change the amount of worker nodes a job has while it is running to allow more jobs to run.
The idea is that we'd have a server running that could accept many job requests at any time, and a large cluster of machines that jobs will be partitioned onto. We'd like to be able to scale horizontally, so whenever the server isn't busy it can make full use of the hardware, as well as being able to make sure that small jobs don't get constantly blocked by larger jobs.
From my research it seems like we'd have to implement another framework to do this (do GridGain and Hadoop allow this?), but I figured I'd ask to see what people recommended to do something like this, and if there's a way to do it without implementing another large framework.
Sorry if anything is unclear or confusing, I'm just a lowly intern who started learning Spring and Spring Batch last month and I'm far from completely understanding everything, especially this scaling stuff. Just ask and I'll try to clear things up.
Thanks for any help!
Take a look at the 'spring-batch-integration' project under the spring-batch-admin umbrella project https://github.com/SpringSource/spring-batch-admin
It has a number of examples of using spring-integration to distribute work to other nodes. IN particular see the chunk and partition packages. Just swap out the spring integration channels with jms channel adapters. By distributing work partitions via JMS, you can scale out the number of worker nodes as needed.
There are a number of threads on this subject in the spring integration forum; search for 'PartitionHandler'.
Hope that helps.

Resources