Scheduled termination of a Nomad job

I'm looking for an analogue of the nomad stop <job> command that I can schedule from the job itself. Is there such a thing in Nomad? I failed to find anything that can do it. It looks like Nomad fundamentally only starts jobs; stopping them is not something that can be specified in the definition.
The only idea I have is to use two jobs, one to start and another to stop. The 'stop' job would use the command line to stop the other one. Is that the right approach? I'm wondering how everyone else does this.

You can always just have your job exit with code 0; jobs don't need to run forever. Nomad will then mark the job as "complete" and you're done.
If the job itself can't control its own exit, then yes, you might need to set up some external job that can make API calls to stop the task, but I'd recommend trying to just have the job exit when it has completed its work.
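For the external-stopper approach, Nomad's HTTP API deregisters a job with a DELETE on /v1/job/:job_id, which is what nomad stop <job> does under the hood. Here is a minimal sketch in Java, assuming a Nomad agent at the default local address and a job named "sample" (both placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StopNomadJob {
    public static void main(String[] args) throws Exception {
        // Assumptions: local Nomad agent on the default port, job ID "sample".
        String nomadAddr = "http://127.0.0.1:4646";
        String jobId = "sample";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(nomadAddr + "/v1/job/" + jobId))
                .DELETE() // deregistering the job stops it, like "nomad stop <job>"
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}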

Nomad has different schedulers for different types of jobs.
For your requirement, you can use a Nomad batch job.
The default job type is service, so if you don't specify a job type it will run as a service job, which is expected not to exit unless stopped explicitly. You can specify the job type in the Nomad job file as follows:
job "sample" {
type = "batch"
# ...
}

Related

How to execute the same job concurrently in Spring XD

We have the below requirement.
In Spring XD, we have a job, let's call it MyJob,
which is invoked by another process through Spring XD's REST service; let's call that process OutsideProcess (a non-Spring XD process).
OutsideProcess invokes MyJob whenever a file is added to a location (let's call it FILES_LOC) that OutsideProcess is listening to.
In this scenario, let's assume MyJob takes 5 minutes to complete.
At 10:00 AM a file is copied to FILES_LOC, so OutsideProcess triggers MyJob immediately (it will complete at approximately 10:05 AM).
At 10:01 AM another file is copied to FILES_LOC, so OutsideProcess triggers one more instance of MyJob at 10:01 AM. But the second instance gets queued and starts executing only once the first instance completes (at approximately 10:05 AM).
If we invoke different jobs at the same time they execute concurrently, but multiple instances of the same job do not.
Please let me know how I can execute multiple instances of the same job concurrently.
Thanks in advance.
The only thing I can think of is dynamic deployment of the job and triggering it right away. You can use the Spring XD REST template to create the job definition on the fly and launch it after sleeping a few seconds. Make sure you undeploy/destroy the job when it completes successfully.
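A rough sketch of that idea using the Spring XD client follows. The admin URL is a placeholder, and the jobOperations() method names (createJob, launchJob, destroyJob) and their signatures are assumptions from memory of the Spring XD client API, so check them against your Spring XD version:

import java.net.URI;
import org.springframework.xd.rest.client.impl.SpringXDTemplate;

public class DynamicJobLauncher {
    public static void main(String[] args) throws Exception {
        SpringXDTemplate xd = new SpringXDTemplate(new URI("http://xd-admin:9393"));

        // Create a uniquely named definition of the same job module and deploy it.
        String jobName = "MyJob-" + System.currentTimeMillis();
        xd.jobOperations().createJob(jobName, "MyJobModule", true);

        Thread.sleep(5000); // give the deployment a few seconds to settle

        // Launch it; once it has completed, destroy the definition again.
        xd.jobOperations().launchJob(jobName, "{}");
        // after completion (polled elsewhere):
        // xd.jobOperations().destroyJob(jobName);
    }
}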
Another solution could be to create a few module instances of your job with different names and use them as your slave processes. You can query the status of these job module instances and launch the one that has finished, or queue the one that was least recently launched.
Remember you can run jobs with partition support if applicable. That way your job will finish faster and you'll be able to run more jobs.

"single point of failure" job tracker node goes down and Map jobs are either running or writing the output

I'm new to Hadoop and would like to know what happens when the "single point of failure" JobTracker node goes down while Map jobs are either running or writing their output. Would the JobTracker start all map jobs over again?
The JobTracker is a single point of failure, meaning that if it goes down you won't be able to submit any additional MapReduce jobs, and existing jobs will be killed.
When you restart your JobTracker, you need to resubmit the whole job again.

What is the difference between job.submit and job.waitForCompletion in Apache Hadoop?

I have read the documentation so I know the difference.
My question, however, is: is there any risk in using .submit instead of .waitForCompletion if I want to run several Hadoop jobs on a cluster in parallel?
I mostly use Elastic MapReduce.
When I tried doing so, I noticed that only the first job was being executed.
If your aim is to run jobs in parallel then there is certainly no risk in using job.submit(). The main reason job.waitForCompletion exists is that its call returns only when the job is finished, and it returns the job's success or failure status, which can be used to decide whether further steps should run.
Now, getting back to why you saw only the first job being executed: this is because by default Hadoop schedules jobs in FIFO order. You can certainly change this behaviour. Read more here.
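A minimal sketch of the submit-then-poll pattern described above, assuming each job's mapper, reducer, and paths are configured elsewhere; the job names and poll interval are placeholders:

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ParallelSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        List<Job> jobs = new ArrayList<>();

        for (int i = 0; i < 3; i++) {
            Job job = Job.getInstance(conf, "parallel-job-" + i);
            // set mapper, reducer, and input/output paths here as usual
            job.submit();          // returns immediately; the job runs asynchronously
            jobs.add(job);
        }

        // Poll until every job finishes, then check the results.
        for (Job job : jobs) {
            while (!job.isComplete()) {
                Thread.sleep(5000);
            }
            System.out.println(job.getJobName() + " succeeded: " + job.isSuccessful());
        }
    }
}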

Quartz one time job on application startup

I am trying to integrate a Quartz job into my Spring application. I got this example from here. The example shows jobs executing at repeated intervals using a SimpleTrigger and at a specific time using a CronTrigger.
My requirement is to run the job only once, on application startup. I removed the repeatInterval property, but the application throws an exception:
org.quartz.SchedulerException: Repeat Interval cannot be zero
Is there any way to schedule a job to run just once?
Thanks.
Found the answer here
Ignoring the repeatInterval and setting repeatCount = 0 does what I wanted.
Spring's SimpleTriggerFactoryBean does the job: if you don't specify the start time, it will set it to 'now'.
Yet I think a long-running one-time job should be considered an anti-pattern, since it will not work even in a 2-node cluster: if the node that runs the job goes down, there will be nothing to restart the job.
I prefer to have a job that repeats, e.g. every hour, but is annotated with @DisallowConcurrentExecution. This way you guarantee that exactly one job will be running, both while the node that originally hosted the job is up and after it goes down.
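To make that concrete, here is a hedged sketch of a Spring Java config wiring a run-once trigger via SimpleTriggerFactoryBean with repeatCount = 0; StartupJob and the bean names are placeholders invented for the example, and you still need a SchedulerFactoryBean (as in the linked example) to register the trigger:

import org.quartz.DisallowConcurrentExecution;
import org.quartz.Job;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.quartz.JobDetailFactoryBean;
import org.springframework.scheduling.quartz.SimpleTriggerFactoryBean;

@Configuration
public class RunOnceQuartzConfig {

    // Placeholder job class; the annotation guards against concurrent runs
    // if you later switch to a repeating trigger, as suggested above.
    @DisallowConcurrentExecution
    public static class StartupJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            // one-time startup work goes here
        }
    }

    @Bean
    public JobDetailFactoryBean startupJobDetail() {
        JobDetailFactoryBean factory = new JobDetailFactoryBean();
        factory.setJobClass(StartupJob.class);
        factory.setDurability(true);
        return factory;
    }

    @Bean
    public SimpleTriggerFactoryBean startupTrigger(JobDetail startupJobDetail) {
        SimpleTriggerFactoryBean trigger = new SimpleTriggerFactoryBean();
        trigger.setJobDetail(startupJobDetail);
        trigger.setRepeatCount(0); // fire exactly once
        trigger.setStartDelay(0);  // no start time specified -> starts "now"
        return trigger;
    }
}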

Hadoop reuse Job object

I have a pool of Jobs from which I retrieve jobs and start them. The pattern is something like:
Job job = JobPool.getJob();
job.waitForCompletion(true);
JobPool.release(job);
I get a problem when I try to reuse a Job object, in the sense that it doesn't even run (most probably because its status is COMPLETED). So, in the following snippet the second waitForCompletion call prints the statistics/counters of the job and doesn't do anything else.
Job jobX = JobPool.getJob();
jobX.waitForCompletion(true);
JobPool.release(jobX);
//.......
jobX = JobPool.getJob();
jobX.waitForCompletion(true); // <--- here the job should run, but it doesn't
Am I right in saying that the job doesn't actually run because Hadoop sees its status as completed and doesn't even try to run it? If so, do you know how to reset a Job object so that I can reuse it?
The Javadoc includes this hint that jobs should only be run once:
The set methods only work until the job is submitted, afterwards they will throw an IllegalStateException.
I think there's some confusion between the job and the view of the job. The latter is the thing you have got, and it is designed to map to at most one job running in Hadoop. The view of the job is fundamentally lightweight, and if creating that object is expensive relative to actually running the job... well, I've got to believe that your jobs are simple enough that you don't need Hadoop.
Using the view to submit a job is potentially expensive (copying jars into the cluster, initializing the job in the JobTracker, and so on); conceptually, the idea of telling the JobTracker to "rerun " or "copy ; run " makes sense. As far as I can tell, there's no support for either of those ideas in practice. I suspect that Hadoop isn't actually guaranteeing retention policies that would support either use case.
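Given that, the usual workaround is to build a fresh Job object per run instead of pooling completed ones. A small sketch of that pattern; the job names are placeholders and the configuration steps stand in for whatever your JobPool currently sets up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class FreshJobPerRun {
    // Build a brand-new Job each time instead of recycling a completed one.
    static Job newJob(Configuration conf, String name) throws Exception {
        Job job = Job.getInstance(new Configuration(conf), name);
        // configure mapper, reducer, and input/output paths here, as JobPool did
        return job;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Job first = newJob(conf, "run-1");
        first.waitForCompletion(true);

        Job second = newJob(conf, "run-2"); // fresh object, so it submits cleanly
        second.waitForCompletion(true);
    }
}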
