I'd like to find the best way to handle exceptions (failure of any steps) from an Oracle scheduler job chain (11gR2).
Say I have a chain that contains 20 steps. If at any point the chain exits with FAILURE, I'd like to do a set of actions. These actions are specific to that chain, not the individual steps (each step's procedure may be used outside of scheduler or in other chains).
Thanks to 11gR2, I can now set up an email notification on FAILURE of the chain, but this is only one of several actions I need to perform, so it's only a partial solution for me.
The only thing I can think of is to have another polling job check the status of my chain every x minutes and launch the failure actions when it sees that the latest run of the chain exited with FAILURE status. But this is a hack at best, imo.
What is the best way to handle exceptions for a given job chain?
thanks
The most flexible way to handle job exceptions in general is to use a job exception monitoring procedure and define the jobs so that they generate events on job status changes. The monitoring procedure should watch the scheduler event queue in a loop and react to events in whatever way you define.
Doing so removes the burden of having to create failure steps for nearly every job step in a chain. This is a very powerful mechanism.
For lack of time: the book contains a complete scenario of event-based scheduling. I will dig one up later.
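The monitoring loop described above can be sketched in a language-neutral way. This is only an illustration in Python, not Oracle code: the in-process `queue.Queue` stands in for the scheduler event queue, and the event tuples, the status strings, the chain name `NIGHTLY_CHAIN` and the `monitor` function are all hypothetical names chosen for the example.

```python
import queue

def monitor(events, handlers, stop_token=None):
    """Consume (job_name, status) events until stop_token arrives.

    Runs the chain-level handler registered for a job whenever that job
    reports a failure. Returns the names of the jobs that were handled.
    """
    handled = []
    while True:
        event = events.get()          # blocks until an event is available
        if event is stop_token:
            break
        job_name, status = event
        # "JOB_FAILED" is an illustrative label, not read from a real queue.
        if status == "JOB_FAILED" and job_name in handlers:
            handlers[job_name](job_name)
            handled.append(job_name)
    return handled

# Usage: one success event, one failure event, then the stop token.
events = queue.Queue()
for item in [("NIGHTLY_CHAIN", "JOB_SUCCEEDED"),
             ("NIGHTLY_CHAIN", "JOB_FAILED"),
             None]:
    events.put(item)

actions = []
handled = monitor(
    events,
    {"NIGHTLY_CHAIN": lambda name: actions.append(f"cleanup after {name}")},
)
```

The point of the design is that the failure actions are attached to the chain name in one place, so the individual step procedures stay reusable outside the chain.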
I need to orchestrate multiple jobs in parallel (each using Lambdas in AWS), ensure all finish by retrying individual jobs as needed, and then update a state only when all jobs have completed successfully. It will look like this:
This image was taken from the Step Functions documentation. I'm considering whether Step Functions might be an answer.
Any thoughts on how this might look using Lambda (with or without Step Functions)? I'm guessing dead-letter queues might be involved to facilitate retries? My biggest unknown is how to update the final state only after all jobs have completed, taking into account that retries may have occurred.
You are correct: AWS Step Functions would solve your problem.
However, since you are also looking for approaches using pure Lambda, you will need persistent state, because Lambda does not keep state across different functions or invocations.
Create a data structure that is checked at the end of each Lambda execution, e.g. boolean attributes that correspond to each process that has to be executed.
At the end of each process (Lambda execution), set the attribute related to that Lambda process to true, then verify whether all the attributes are true. If they are, you can invoke the Lambda responsible for the next step of your pipeline.
If you need retries when errors come up, implement a DLQ, which gives you more control over them.
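The completion check described above can be sketched as follows. This is a minimal sketch, assuming a plain dictionary as a stand-in for the persistent store; the function name `mark_done` and the job ids are hypothetical. In a real deployment the update and the "all true" check would have to be done atomically in the store, otherwise two Lambdas finishing at the same moment could both (or neither) trigger the next step.

```python
def mark_done(state, job_id):
    """Flag job_id as finished and report whether every job is now done.

    `state` maps job id -> bool and stands in for a persistent record.
    Setting a flag that is already True is harmless, so a retried job
    that reports success twice does not corrupt the result.
    """
    if job_id not in state:
        raise KeyError(f"unknown job: {job_id}")
    state[job_id] = True
    return all(state.values())

# Usage: three parallel jobs; "job-b" reports twice because of a retry.
state = {"job-a": False, "job-b": False, "job-c": False}
results = [mark_done(state, job) for job in ("job-a", "job-b", "job-b", "job-c")]
# Only the call that completes the last outstanding job returns True;
# that is the point where the next pipeline step would be invoked.
```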
My requirement is that whenever a certain REST API is called from the UI or Postman, the backend should trigger a job that performs several operations/tasks.
Example:
Assume a POST REST API is invoked:
It should invoke "Identify-JOB" (which performs several activities). Based on a certain condition, it should then invoke PLANA-JOB or PLANB-JOB.
1> Suppose PLANA-JOB is invoked. On success of this job, it should trigger another job called "finish-JOB". On failure, it should not invoke "finish-JOB".
Can you please help me with how I can do this?
You can use async processing: the request triggers the first job, and each task then triggers the next set of tasks when it completes.
You can build the flow like AWS Step Functions.
You can use Rqueue to enqueue an async task, which will then be processed by one of the listeners.
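The chaining logic from the question can be sketched as below. This is not Rqueue or Step Functions code; it is a synchronous Python illustration in which each job is a plain callable, and the names `run_chain`, `identify`, `plan_a`, `plan_b` and `finish` are hypothetical. It assumes that the "finish" job follows either plan on success, which the question only states for PLANA-JOB. In an asynchronous setup each step would enqueue the next one instead of calling it directly.

```python
def run_chain(request, identify, plan_a, plan_b, finish):
    """Run identify, branch to plan A or B, and run finish only on success.

    Returns the list of job names that were started, in order.
    """
    executed = ["identify"]
    use_plan_a = identify(request)            # the branching condition
    name, plan = ("plan_a", plan_a) if use_plan_a else ("plan_b", plan_b)
    executed.append(name)
    try:
        result = plan(request)
    except Exception:
        return executed                       # plan failed: do not trigger finish
    executed.append("finish")
    finish(result)
    return executed

def failing_plan(request):
    raise RuntimeError("plan failed")

# Usage: a successful plan A run, then a failing plan A run.
finished = []
ok_run = run_chain({"id": 1}, lambda r: True, lambda r: "done",
                   lambda r: "done", finished.append)
bad_run = run_chain({"id": 2}, lambda r: True, failing_plan,
                    lambda r: "done", finished.append)
```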
I've been trying to google and search Stack Overflow for the answer but have been unable to find one.
Using NiFi, is it possible to stop a process upon previous job failure?
We have user data we need to process but the data is sequentially constructed so that if a job fails, we need to stop further jobs from running.
I understand we can create scripts to fail a process upon previous process failure, but what if I need entire group to halt upon failure, is this possible? We don't want each job in queue to follow failure path, we want it to halt until we can look at the data and analyze the failure.
TL;DR - can we STOP a process upon a failure, not just funnel all remaining jobs into the failure flow. We want data in queues to wait until we fix, thus stop process, not just fail again and again.
Thanks for any feedback, cheers!
Edit: typos
You can configure backpressure on the queues to stop upstream processes. If you set the backpressure threshold to 1 on a failure queue, it would effectively stop the processor until you had a chance to address the failure.
The screenshot shows failure routing back to the processor, but this is not required. What matters is that the next processor does not remove the failed item from the queue, so that the backpressure is maintained until you take action.
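The effect of a backpressure threshold of 1 on the failure queue can be modelled with a short simulation. This is not NiFi code or configuration; it is a Python sketch of the behaviour, with a hypothetical `process` function and handler, in which the "processor" is only allowed to take the next item while the failure queue is below its threshold.

```python
from collections import deque

def process(items, handler, failure_threshold=1):
    """Process items in order, halting once the failure queue is full.

    Returns (done, failed, pending): results of successful items, items
    sitting in the failure queue, and items still waiting upstream.
    """
    pending = deque(items)
    failure_queue = deque()
    done = []
    # The processor runs only while the failure queue is below threshold.
    while pending and len(failure_queue) < failure_threshold:
        item = pending.popleft()
        try:
            done.append(handler(item))
        except Exception:
            failure_queue.append(item)   # stays here until someone intervenes
    return done, list(failure_queue), list(pending)

def handler(item):
    if item == 2:
        raise ValueError("bad record")
    return item * 10

# Usage: item 2 fails, so items 3 and 4 wait instead of failing in turn.
done, failed, pending = process([1, 2, 3, 4], handler)
```

The remaining items are neither processed nor routed to failure; they simply wait upstream, which matches the requirement in the question.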
I am using Laravel 5.1, and I have a task that takes around 2 minutes to process, and this task particularly is generating a report...
Now, it is obvious that I can't make the user wait for 2 minutes on the same page where I took user's input, instead I should process this task in the background and notify the user later about task completion...
So, to achieve this, Laravel provides Queues that run tasks in the background (if I understood correctly). Now, in a multi-user environment, i.e. if more than one user requests report generation (say there are 4 users), does the name Queues mean that tasks will be performed one after the other (i.e. when 4 users request report generation one after another, the 4th user's report will only be generated once the 3rd user's report has been generated)?
If Queues complete their tasks one after the other, is there any way for tasks to be processed in the background immediately on the user's request, with the user notified later when the task is completed?
A queue-based architecture is a little more complicated than that. The Queue provides you with an interface to different messaging implementations such as RabbitMQ or beanstalkd.
At any point in your code you can send a message to the Queue, which in this context is termed a job. Your queue will then hold multiple jobs, which are ready to be taken off in FIFO order.
As for your questions: there are workers that listen to the queue, fetch a job and execute it. It's up to you how many workers you run. With one worker, your tasks are executed one after another; the more workers you have, the more jobs are processed in parallel.
Worker processes are started with Laravel's command-line interface, Artisan. Each process is one worker. You can start multiple workers with Supervisor.
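The relationship between queue, jobs and workers can be illustrated with a small sketch. This is not Laravel code; it uses Python threads as stand-ins for worker processes, and `run_workers` and the job values are hypothetical names for the example.

```python
import queue
import threading

def run_workers(jobs, handle, worker_count):
    """Put jobs on a FIFO queue and let worker_count workers drain it.

    Returns the handled results in the order the workers finished them.
    """
    job_queue = queue.Queue()
    for job in jobs:
        job_queue.put(job)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = job_queue.get_nowait()   # take the next job, FIFO
            except queue.Empty:
                return                         # nothing left: worker exits
            outcome = handle(job)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

# Usage: one worker handles the four reports strictly one after another;
# two workers handle the same jobs concurrently, so the finishing order
# may differ from the queue order.
serial = run_workers([1, 2, 3, 4], lambda j: f"report-{j}", worker_count=1)
parallel = run_workers([1, 2, 3, 4], lambda j: f"report-{j}", worker_count=2)
```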
Since you know for sure that you are going to send the notification to the user after around 2 minutes, I suggest using a cron job that checks every 2 minutes whether there are any reports to generate and, if there are, you can send the notification to the user. That check is a single simple query, so you don't need to worry much about performance.
A certain number of jobs need to be executed in sequence, such that the result of one job is the input to the next. There's also a loop in one part of the job chain. Currently, I'm running this sequence using wait for completion, but I'm going to start the sequence from a web service, so I don't want to get stuck waiting for the response. I want to start the sequence and return.
How can I do that, considering that the jobs depend on each other?
The typical approach I follow is to use an Oozie workflow to chain the sequence of jobs, passing the dependent inputs to each of them accordingly.
I used a shell script to invoke the Oozie job.
I am not sure about loops within an Oozie workflow, but the link below describes a way to implement loops within the workflow. I hope it helps you.
http://zapone.org/bernadette/2015/01/05/how-to-loop-in-oozie-using-sub-workflow/
Apart from this, the JobControl class is also a good option if the jobs need to run in sequence, and it requires less effort to implement. A loop would be easy to do, since it would be written entirely in Java code.
http://gandhigeet.blogspot.com/2012/12/hadoop-mapreduce-chaining.html
https://cloudcelebrity.wordpress.com/2012/03/30/how-to-chain-multiple-mapreduce-jobs-in-hadoop/
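The "start the sequence and return" requirement can be sketched independently of Oozie or JobControl. The following is a plain Python illustration, not Hadoop code: each job is a callable whose output is the next job's input, the loop is an ordinary `while`, and the whole chain runs on a background thread so that the caller returns at once. The names `run_chain` and `start_chain` and the example steps are hypothetical.

```python
import threading

def run_chain(value, before, loop_body, keep_looping, after):
    """Run jobs in sequence, feeding each result into the next job."""
    for step in before:
        value = step(value)
    while keep_looping(value):      # the looping part of the chain
        value = loop_body(value)
    for step in after:
        value = step(value)
    return value

def start_chain(value, before, loop_body, keep_looping, after, results):
    """Start the chain in the background and return without waiting."""
    def target():
        results.append(run_chain(value, before, loop_body, keep_looping, after))
    thread = threading.Thread(target=target)
    thread.start()
    return thread

# Usage: the web service handler would call start_chain and return;
# join() is used here only so the example can inspect the result.
results = []
thread = start_chain(
    1,
    before=[lambda v: v + 1],
    loop_body=lambda v: v * 2,
    keep_looping=lambda v: v < 20,
    after=[lambda v: v - 1],
    results=results,
)
thread.join()
```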