I am trying to modify the accepted solution to this question here: https://stackoverflow.com/questions/62362298/run-procedures-in-parallel-oracle-pl-sql
such that -
When a program (or stored procedure) that is part of a chain step finishes, it immediately restarts for the next invocation.
I am basically trying to create a way to run jobs in parallel continuously. The accepted solution works for a single execution of the parallel jobs, but I am unsure how to keep these jobs running indefinitely.
So far, I have read the Scheduler documentation, and it seems that maybe a rule with evaluation_interval could be used? I am not sure.
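For what it's worth, the direction I'm currently considering (all names below are made up, so this is only a sketch, not a tested solution) is to drop the chain and give each procedure its own Scheduler job that re-runs as soon as its previous run finishes; as far as I understand, the Scheduler never runs two instances of the same job at the same time, so a very short repeat_interval effectively means "restart when done":

BEGIN
  -- One job per procedure; a one-second repeat interval plus the Scheduler's
  -- rule of never running two instances of the same job concurrently means
  -- each run starts again as soon as the previous one has finished.
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'PARALLEL_WORKER_1',   -- hypothetical job name
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'MY_PKG.MY_PROC_1',    -- hypothetical procedure
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=SECONDLY;INTERVAL=1',
    enabled         => TRUE);
END;
/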
Related
I'm looking for a way to monitor several long-running jobs in a session from another, parallel job; however, there is no way to pass the current session's jobs (Get-Job) as a parameter into another job, unless I assign them to a variable and process them one at a time in a pipeline, which is time consuming.
I might end up having to do something like this, even though it keeps the session busy: https://gallery.technet.microsoft.com/scriptcenter/Monitor-and-display-808ce573. However, the downside of that approach is that I cannot stop the jobs or interact with them in any way until all of them are completed/failed; that's why I was trying to find a solution using a parallel monitoring job.
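The closest I have got so far is only a rough sketch, and it stays in the same session rather than using a separate monitoring job: subscribe to each job's StateChanged event instead of blocking on Wait-Job, so the session remains free to Stop-Job or Receive-Job while still being notified when something finishes:

# Register one event subscription per job; the action block fires on every
# state change while the console stays interactive.
Get-Job | ForEach-Object {
    Register-ObjectEvent -InputObject $_ -EventName StateChanged -Action {
        $job = $Sender
        Write-Host ("Job {0} ({1}) is now {2}" -f $job.Id, $job.Name, $job.State)
        if ($job.State -in 'Completed','Failed','Stopped') {
            # Drop the subscription once the job reaches a final state
            Unregister-Event -SubscriptionId $EventSubscriber.SubscriptionId
        }
    } | Out-Null
}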
I have a shell script which will trigger a PL/SQL report generation procedure after certain pre-conditions are satisfied. The logic for checking whether the pre-conditions are fulfilled is written in a PL/SQL package. The report generation needs to wait as long as the pre-conditions are not fulfilled.
What are the pros and cons of waiting using dbms_lock.sleep inside PL/SQL procedure vs UNIX sleep?
Like a lot of design decisions, the answer is: it depends.
Database connections are expensive and relatively time-consuming operations, so the more efficient approach would probably be to connect to the database once and let the PL/SQL job handle the waiting.
Also it's probably cleaner to have a simple PL/SQL call and let the database handle the report or sleep logic rather than write an API that returns a state which the calling program must interpret and act on. This also gives you a neater path to alternative execution (say by calling from a GUI or a DBMS_SCHEDULER job).
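As a rough illustration of the PL/SQL-side approach (the package and procedure names below are hypothetical stand-ins for your pre-condition package and report procedure):

CREATE OR REPLACE PROCEDURE run_report_when_ready (
  p_max_wait_seconds IN PLS_INTEGER DEFAULT 3600
) AS
  l_waited PLS_INTEGER := 0;
BEGIN
  -- Poll the pre-condition check inside the database, sleeping between
  -- attempts, so the shell script needs only a single call.
  WHILE NOT report_pkg.preconditions_met LOOP
    IF l_waited >= p_max_wait_seconds THEN
      RAISE_APPLICATION_ERROR(-20001, 'Timed out waiting for pre-conditions');
    END IF;
    DBMS_LOCK.SLEEP(60);   -- requires EXECUTE on SYS.DBMS_LOCK
    l_waited := l_waited + 60;
  END LOOP;
  report_pkg.generate_report;
END run_report_when_ready;
/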
There are two specific advantages of using a shell script sleep:
You have the option of emitting a status every time the loop enters sleep mode (if this is interactive)
Execute on sys.dbms_lock is not granted to anybody by default. Some DBAs can be reluctant to grant execute on that package.
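For comparison, a rough sketch of the shell-side alternative, assuming the check is also exposed as a SQL-callable function that returns 1 when the pre-conditions are met (again, all names are hypothetical):

#!/bin/sh
# Poll the pre-condition check from the shell, printing a status line on each
# pass; note that every iteration opens a new database connection.
while true; do
  ready=$(sqlplus -s "$DB_CREDENTIALS" <<'EOF'
set heading off feedback off
select report_pkg.preconditions_met_sql from dual;
EOF
)
  if [ "$(echo "$ready" | tr -d '[:space:]')" = "1" ]; then
    break
  fi
  echo "$(date): pre-conditions not met, sleeping for 60 seconds"
  sleep 60
done

sqlplus -s "$DB_CREDENTIALS" <<'EOF'
exec report_pkg.generate_report
EOF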
A certain number of jobs need to be executed in a sequence, such that the result of one job is the input to another. There is also a loop in one part of the job chain. Currently, I'm running this sequence by waiting for each job's completion, but I'm going to start the sequence from a web service, so I don't want to get stuck waiting for the response. I want to start the sequence and return.
How can I do that, considering that the jobs depend on each other?
The typical approach I follow is to use an Oozie workflow to chain the sequence of jobs, passing the dependent inputs to them accordingly.
I used a shell script to invoke the Oozie job.
I am not sure about loops within an Oozie workflow, but the link below describes a way to implement them. Hope it helps you.
http://zapone.org/bernadette/2015/01/05/how-to-loop-in-oozie-using-sub-workflow/
Apart from this, the JobControl class is also a good option if the jobs need to run in sequence, and it requires less effort to implement. Looping would be easy since it is all done in Java code; a short sketch follows the links below.
http://gandhigeet.blogspot.com/2012/12/hadoop-mapreduce-chaining.html
https://cloudcelebrity.wordpress.com/2012/03/30/how-to-chain-multiple-mapreduce-jobs-in-hadoop/
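For the JobControl route, a bare-bones sketch (the two Job objects are assumed to be configured elsewhere): the second step declares a dependency on the first, and JobControl runs on a background thread so the caller, for example your web service, can return immediately:

import java.io.IOException;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class ChainRunner {
    public static void runChain(Job first, Job second) throws IOException {
        ControlledJob step1 = new ControlledJob(first.getConfiguration());
        step1.setJob(first);
        ControlledJob step2 = new ControlledJob(second.getConfiguration());
        step2.setJob(second);
        step2.addDependingJob(step1);   // step2 starts only after step1 succeeds

        JobControl control = new JobControl("report-chain");
        control.addJob(step1);
        control.addJob(step2);

        // JobControl.run() keeps polling until stop() is called, so start it on
        // a background thread and let a small watcher shut it down when done.
        new Thread(control, "report-chain-runner").start();
        new Thread(() -> {
            try {
                while (!control.allFinished()) {
                    Thread.sleep(5000);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            control.stop();
        }, "report-chain-watcher").start();
    }
}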
I have read the documentation so I know the difference.
My question, however, is: is there any risk in using .submit instead of .waitForCompletion if I want to run several Hadoop jobs on a cluster in parallel?
I mostly use Elastic Map Reduce.
When I tried doing so, I noticed that only the first job was being executed.
If your aim is to run jobs in parallel, then there is certainly no risk in using job.submit(). The main reason job.waitForCompletion() exists is that its call returns only when the job has finished, and it returns the job's success or failure status, which can be used to decide whether further steps should be run.
Now, getting back to your seeing only the first job being executed: this is because, by default, Hadoop schedules jobs in FIFO order. You can certainly change this behaviour. Read more here.
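For illustration, a minimal sketch of the submit-then-poll pattern (the Job objects are assumed to be fully configured beforehand):

import java.util.List;
import org.apache.hadoop.mapreduce.Job;

public class ParallelSubmit {
    public static void submitAll(List<Job> jobs) throws Exception {
        // submit() returns immediately, so all jobs reach the cluster at once
        // (how many actually run in parallel is up to the scheduler).
        for (Job job : jobs) {
            job.submit();
        }
        // Optional: poll until everything has finished; isComplete() does not block.
        while (true) {
            boolean allDone = true;
            for (Job job : jobs) {
                if (!job.isComplete()) {
                    allDone = false;
                    break;
                }
            }
            if (allDone) {
                break;
            }
            Thread.sleep(5000);
        }
        for (Job job : jobs) {
            System.out.println(job.getJobName() + " succeeded: " + job.isSuccessful());
        }
    }
}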
I'm using LoadLeveler to submit jobs on an IBM/BlueGene architecture. I have read the documentation from IBM and also gave Google a try, but I cannot find how to do the following, which I expect should be there:
One can use the queue keyword to tell LoadLeveler that a new job step is described, so I could do something like
first_step
queue
second_step
queue
but what I fail to find is a way that does something like
loop job_id = 1,10
do_job_with_given_job_id
end
Do I have to write a "normal" shell script that in turn calls a LoadLeveler script a number of times, or is there some built-in loop mechanism? I know that other job managers can do this.
When this comes up, we normally just recommend writing a shell script which generates the job submission script or scripts; that's what I do for my own jobs (a rough sketch is at the end of this answer). Do these steps have dependencies on each other?
Also, just out of curiosity, which schedulers/resource managers can queue multiple jobs within a loop in a submission script? Not the PBS-based ones...
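To make the first suggestion concrete, here is a rough sketch of such a generator script; the executable name is made up, and the exact keywords will depend on your site's LoadLeveler setup:

#!/bin/sh
# Generate a LoadLeveler command file with one step per job id, then submit it once.
cmdfile=jobs.ll

cat > "$cmdfile" <<'EOF'
# @ job_name = looped_jobs
# @ output   = $(job_name).$(step_name).out
# @ error    = $(job_name).$(step_name).err
EOF

for job_id in 1 2 3 4 5 6 7 8 9 10
do
  cat >> "$cmdfile" <<EOF
# @ step_name  = step_$job_id
# @ executable = ./do_job_with_given_job_id
# @ arguments  = $job_id
# @ queue
EOF
done

llsubmit "$cmdfile"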