I have an Activity table that receives all the table events of the system. Events such as new orders and insertions/deletions on any of the system tables are inserted into this table, so the number of events per second for the Activity table is very high.
Now I want to process the incoming events according to business logic that depends on the table that raised the event. Each table may have a different procedure to do the processing.
I followed the approach from the question "Parallelizing calls in PL/SQL".
As a solution, I have created multiple dbms_scheduler jobs that are started at the same time. All of these jobs (JOB1, JOB2, ..., JOB10) have the same procedure (ProcForAll_Processing) as their JOB_ACTION to achieve parallel processing.
begin
dbms_scheduler.run_job('JOB1',false);
dbms_scheduler.run_job('JOB2',false);
end;
ProcForAll_Processing: this procedure in turn calls 6 other procedures (Proc1, Proc2, Proc3, ..., Proc6) sequentially. I want to achieve parallel processing for these as well.
P.S.: We can't create further jobs to achieve parallel processing inside ProcForAll_Processing, because that would consume more resources and the DBA does not agree to creating more jobs. Also, I can't use dbms_parallel_execute for parallel processing.
Please help me, as I am really stuck on this.
It is impossible in the general case without jobs, and jobs will create multiple sessions for this. There is no such thing as multithreading in PL/SQL, with a few exceptions. One of them is parallel execution of SQL statements [1]. There have been some attempts to abuse this for parallel execution of PL/SQL code; for an example, look at [2].
But as I've said, it's abuse, IMHO.
References:
[1] https://docs.oracle.com/cd/B19306_01/server.102/b14223/usingpe.htm
[2] http://www.williamrobertson.net/documents/parallel-plsql-launcher.html
Get a new DBA. Or even better, cut them out of any decision making processes. A DBA should not review your code and should not tell you to not create jobs, unless there is a good, specific reason.
Using DBMS_SCHEDULER to run things in parallel is by far the easiest and most common way to achieve this result. Of course it's going to consume more resources, that's what parallelism will inevitably do.
Another, poorer option, is to use parallel pipelined table functions. It's an advanced PL/SQL feature that can't be easily explained in a simple example. The best I can do is refer you to the manual.
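For readers who want a starting point anyway, here is a minimal sketch of a parallel-enabled pipelined table function. The type num_tab, the function process_ids, and the id column are hypothetical names chosen for illustration; Activity is the table from the question. The function only reads and pipes rows. Any DML inside it would need extra handling (for example, an autonomous transaction) that is not shown here.

```sql
-- Hypothetical collection type for the rows returned by the function.
CREATE OR REPLACE TYPE num_tab AS TABLE OF NUMBER;
/

-- PARTITION ... BY ANY lets Oracle split the input cursor's rows
-- across parallel execution servers in any way it likes.
CREATE OR REPLACE FUNCTION process_ids (p_cur IN SYS_REFCURSOR)
  RETURN num_tab
  PIPELINED
  PARALLEL_ENABLE (PARTITION p_cur BY ANY)
IS
  l_id NUMBER;
BEGIN
  LOOP
    FETCH p_cur INTO l_id;
    EXIT WHEN p_cur%NOTFOUND;
    -- Per-row business logic would go here (assumption: read-only work).
    PIPE ROW (l_id);
  END LOOP;
  CLOSE p_cur;
  RETURN;
END process_ids;
/

-- The parallel hint on the input query is what allows the function
-- to be executed by several parallel execution servers.
SELECT COUNT(*)
FROM   TABLE(process_ids(CURSOR(SELECT /*+ PARALLEL(a 4) */ a.id
                                FROM   activity a)));
```

Whether the work is actually spread across several processes depends on the instance's parallel execution settings, so treat the degree of 4 as a request rather than a guarantee.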
You should try DBMS_PARALLEL_EXECUTE (available since Oracle Database 11g Release 2).
https://blogs.oracle.com/warehousebuilder/entry/parallel_processing_in_plsql
https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2
I have a job which picks a record from a cursor and then it calls a stored procedure which processes the record picked up from the cursor.
The stored procedure runs multiple queries to process each record. In all, the procedure takes about 0.3 seconds to process a single record picked up by the cursor, but since the cursor contains more than 100k records, the job takes hours to complete.
The queries in the stored procedure are all optimized.
I was thinking of making the procedure run in a multithreaded way, as in Java and other programming languages.
Can this be done in Oracle? Or is there any other way I can reduce the run time of my job?
I agree with the comments regarding processing cursors in a loop. As Tom Kyte often said "Row at a time [processing] is slow at a time"; Oracle performs best with set based operations and row-at-a-time operations usually have scalability issues (i.e. very susceptible to poor performance when things change on the DB such as CPU capacity, workload, number of records that need processing, changes in size of underlying tables, ...).
You probably already know that Oracle has had a Java VM built in to the DB engine since 8i, so you might be able to have Java code wrapped in PL/SQL, but this is not for the faint of heart [not saying that you are, just sayin'].
Before going to the trouble of re-writing your application, I would recommend the following tuning approach as it may yield some actionable tunings [assumes diagnostics and tuning pack licenses; won't remove the scalability issues but may lessen the impact of them]:
In Oracle 11g and above:
Find the top-level SQL ID recorded in gv$active_session_history and dba_hist_active_sess_history for the call to the PL/SQL procedure.
Examine the wait events for the sql_ids under that top_level_sql_id (they tell you what the SQL is waiting on).
Run the tuning advisor on those sql_ids and check for any tuning recommendations. Even if the SQL is already sub-second, getting it from hundredths of a second to thousandths of a second can have a big impact when it is called many times.
Run the ADDM report for the period when the procedure is running. Often you will find that heavy PL/SQL processes require increase in PGA. Further, ADDM may advise other relevant actions (e.g. increase SGA, session cached cursors, db writer processes, log buffer, run segment tuning advisor, ...)
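The first two steps above can be sketched as a single query. This is only an illustration: the bind variable :top_level_sql_id is a placeholder for the SQL ID of the PL/SQL call found in the first step, and querying ASH assumes the Diagnostics Pack license mentioned earlier.

```sql
-- Count ASH samples per statement and wait event for everything that ran
-- under one top-level PL/SQL call. Samples with no wait event recorded
-- are reported under the session state instead (typically ON CPU).
SELECT ash.sql_id,
       NVL(ash.event, ash.session_state) AS wait_event,
       COUNT(*)                          AS samples
FROM   gv$active_session_history ash
WHERE  ash.top_level_sql_id = :top_level_sql_id
GROUP  BY ash.sql_id, NVL(ash.event, ash.session_state)
ORDER  BY samples DESC;
```

The statements at the top of the result are the ones worth passing to the tuning advisor first. For periods that have already aged out of memory, the same query shape should work against dba_hist_active_sess_history.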
I have a scenario where I need to call a set of different Oracle procedures in parallel. These procedures must share the same initial context, which contains uncommitted changes. I cannot commit the parent transaction because of the risk of read inconsistency between those parallel processes.
Is it possible in PL/SQL?
One thing comes to my mind: mapreduce with table functions http://blogs.oracle.com/datawarehousing/entry/mapreduce_oracle_tablefunction
I've used this in several scenarios to run things concurrently, though I'm not sure it is applicable to your problem.
You may be able to accomplish this using the DBMS_XA package, which allows you to "switch or share transactions across SQL*Plus sessions or processes using PL/SQL".
Oracle-Base has a good example of how to use the package.
(But if your goal is to use parallelism to improve performance you should use normal statement-level parallel execution instead.)
Far as I know: no.
DBMS_JOB and DBMS_SCHEDULER can be used to run Oracle procedures in parallel but they run them in their own sessions.
I have two PL/SQL systems, residing in two separate databases. SystemA will need to populate SystemB's tables. This will probably be done over a database link. Every time a set of records is inserted in SystemB's tables, a process in SystemB must run. I could wait for SystemA to complete and then run a script to start processing in SystemB, but since SystemA could spend many hours processing and then populating SystemB, I'd rather that SystemB handle each set of records as soon as it becomes available (each set can be processed independently of the others, so this should work OK).
What I'm not sure of is how I can do event-driven programming in PL/SQL. I'd need SystemA to notify SystemB that a set is ready for processing. My first idea was to have a special "event" table in SystemB; when SystemA finishes a set, it inserts into the "event" table, and an insert trigger starts the process in SystemB (the process could be a long one, possibly 5-10 minutes per run). I don't have enough experience with triggers in Oracle to know if this is an established way of doing it, or if there's a better mechanism. Suggestions? Tips? Advice?
Use Oracle Advanced Queuing; it's designed for this. I believe you'll still have to set up a database link between the two systems (from B to A in this case, to consume the queue on A).
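A minimal single-database sketch of the queue-based approach, to show the moving parts. The names set_ready_t, set_ready_qt, and set_ready_q are hypothetical, and so is the idea of identifying a set by a numeric set_id. The database link and any cross-database propagation are deliberately left out.

```sql
-- Hypothetical payload type: identifies one set of records that is ready.
CREATE OR REPLACE TYPE set_ready_t AS OBJECT (set_id NUMBER);
/

-- One-time setup: queue table, queue, and starting the queue.
BEGIN
  DBMS_AQADM.CREATE_QUEUE_TABLE(
    queue_table        => 'set_ready_qt',
    queue_payload_type => 'set_ready_t');
  DBMS_AQADM.CREATE_QUEUE(
    queue_name  => 'set_ready_q',
    queue_table => 'set_ready_qt');
  DBMS_AQADM.START_QUEUE(queue_name => 'set_ready_q');
END;
/

-- Producer side: announce that set 42 (an example value) is ready.
DECLARE
  l_enq_opts  DBMS_AQ.ENQUEUE_OPTIONS_T;
  l_msg_props DBMS_AQ.MESSAGE_PROPERTIES_T;
  l_msgid     RAW(16);
BEGIN
  DBMS_AQ.ENQUEUE(
    queue_name         => 'set_ready_q',
    enqueue_options    => l_enq_opts,
    message_properties => l_msg_props,
    payload            => set_ready_t(42),
    msgid              => l_msgid);
  COMMIT;  -- the message becomes visible to consumers on commit
END;
/

-- Consumer side: block until a message arrives, then process that set.
DECLARE
  l_deq_opts  DBMS_AQ.DEQUEUE_OPTIONS_T;
  l_msg_props DBMS_AQ.MESSAGE_PROPERTIES_T;
  l_msgid     RAW(16);
  l_payload   set_ready_t;
BEGIN
  l_deq_opts.wait := DBMS_AQ.FOREVER;
  DBMS_AQ.DEQUEUE(
    queue_name         => 'set_ready_q',
    dequeue_options    => l_deq_opts,
    message_properties => l_msg_props,
    payload            => l_payload,
    msgid              => l_msgid);
  -- The long-running processing for l_payload.set_id would be called here.
  COMMIT;
END;
/
```

Because the dequeue is part of the consumer's transaction, a failure before the final COMMIT rolls the message back onto the queue instead of losing it.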
Yes, Oracle Advanced Queuing, or even having A submit a venerable Oracle job to B, would be a better idea.
And if your process is going to need complete replication of the data from A to B, then you might want to look at something like an Oracle Streams process to copy over the data and then do the processing.
I use PHP and Oracle, with crontab executing the PHP scripts at scheduled times. My current logging/auditing solution involves simple log files. I'd like to save my cron execution logs to a database instead.
Right now I'm trying to design the process so that when a cron job starts I create a record in a CronExecution table. Then every time I want to log something for that cron I'll put a record in a CronEvent table, which will have a foreign key to the CronExecution table.
I plan to log all events using a PL/SQL procedure declared with PRAGMA AUTONOMOUS_TRANSACTION. With this procedure I will be able to log events inside other PL/SQL procedures and also from my PHP code in a consistent manner.
To make it more robust, my plan is to have a fallback to log errors to a file in the event that the database log calls fail.
Has anybody else written something similar? What suggestions do you have based on your experience?
Yep, I've done this several times.
The CronExecution table is a good idea. However, I don't know that you really need to create the CronEvent table. Instead, just give yourself a "status" column and update that column.
You'll find that makes failing over to file much easier. As you build lots of these CronExecutions, you'll probably have less interest in CronEvents and more interest in the full status of the execution.
Wrap all of the update calls in stored procedures. You definitely have that correct.
Including a 'schedule' in your CronExecution will prove handy. It's very easy to have a lot of cron jobs and not be able to connect the dots on "how long did this take?" and "when is this supposed to run". Including a "next scheduled run" on completion of a job will make your life much easier.
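As a rough sketch of the stored-procedure wrapper discussed above: the procedure name set_cron_status and the columns execution_id, status, and updated_at are assumptions for illustration; only the CronExecution table itself comes from the question.

```sql
CREATE OR REPLACE PROCEDURE set_cron_status (
  p_execution_id IN NUMBER,
  p_status       IN VARCHAR2
) IS
  -- Run in a separate transaction so the status update is saved even if
  -- the calling transaction later rolls back.
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  UPDATE cron_execution
  SET    status     = p_status,
         updated_at = SYSTIMESTAMP
  WHERE  execution_id = p_execution_id;

  COMMIT;  -- commits only this autonomous transaction
EXCEPTION
  WHEN OTHERS THEN
    -- An autonomous transaction must end before control returns to the
    -- caller; re-raise so the caller can fall back to file logging.
    ROLLBACK;
    RAISE;
END set_cron_status;
/
```

The same procedure can be called from PHP and from other PL/SQL code, which keeps the logging path consistent across both.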
You will want to read the documentation on DBMS_SCHEDULER.
It's a little bit of work to implement (what isn't?), but it will allow you to control scheduled and one-time jobs from within the database. That includes OS-level jobs.
Is there any feature for asynchronous calls in PL/SQL?
Suppose I am in a block of code and would like to call a procedure multiple times, and I don't care when the procedure finishes or what it returns.
BEGIN
myProc(1,100);
myProc(101,200);
myProc(201,300);
...
...
END;
In the above case, I don't want my code to wait for myProc(1,100) to finish processing before executing myProc(101,200).
Thanks.
+1 for DBMS_SCHEDULER and DBMS_JOB approaches, but also consider whether you ought to be using a different approach.
If you have a procedure which executes in a row-by-row manner and you find that it is slow, the answer is probably not to run the procedure multiple times simultaneously but to ensure that a set-based approach is used instead. At an extreme you can even then use parallel query and parallel DML to reduce the wall clock time of the process.
I mention this only because it is a very common fault.
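To make the contrast concrete, here is a hedged sketch of what a set-based replacement with parallel DML might look like. The table my_table and its columns id and processed_flag are purely hypothetical; the real statement depends on what myProc actually does for each row.

```sql
-- Parallel DML has to be enabled explicitly for the session.
ALTER SESSION ENABLE PARALLEL DML;

-- One statement handles the whole range instead of one call per row.
UPDATE /*+ PARALLEL(t 4) */ my_table t
SET    t.processed_flag = 'Y'
WHERE  t.id BETWEEN 1 AND 300;

-- After parallel DML the transaction must be committed (or rolled back)
-- before the same session can query the modified table again.
COMMIT;
```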
Submit it in a DBMS_JOB like so:
declare
ln_dummy number;
begin
DBMS_JOB.SUBMIT(ln_dummy, 'begin myProc(1,100); end;');
DBMS_JOB.SUBMIT(ln_dummy, 'begin myProc(101,200); end;');
DBMS_JOB.SUBMIT(ln_dummy, 'begin myProc(201,300); end;');
COMMIT;
end;
You'll need the job_queue_processes parameter set to >0 so that job queue processes are spawned to run the jobs. You can query the jobs by examining the view user_jobs.
Note that this applies to Oracle 9i, not sure what support 10g has. See more info here.
EDIT: Added missed COMMIT
You may want to look into DBMS_SCHEDULER.
Edited for completeness:
DBMS_SCHEDULER is available on Oracle 10g. For versions before this, DBMS_JOB does approximately the same job.
For more information, see: http://download.oracle.com/docs/cd/B12037_01/server.101/b10739/jobtosched.htm
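For comparison, a minimal sketch of the same three asynchronous calls submitted through DBMS_SCHEDULER. The MYPROC_ job-name prefix is an arbitrary choice; myProc and its ranges are taken from the question.

```sql
BEGIN
  FOR i IN 0 .. 2 LOOP
    DBMS_SCHEDULER.CREATE_JOB(
      -- Generates a unique job name starting with the given prefix.
      job_name   => DBMS_SCHEDULER.GENERATE_JOB_NAME('MYPROC_'),
      job_type   => 'PLSQL_BLOCK',
      -- Builds myProc(1,100), myProc(101,200), myProc(201,300).
      job_action => 'begin myProc(' || (i * 100 + 1) || ',' ||
                    ((i + 1) * 100) || '); end;',
      enabled    => TRUE,   -- no start_date given, so it runs as soon as possible
      auto_drop  => TRUE);  -- the job definition is removed after it completes
  END LOOP;
END;
/
```

Unlike DBMS_JOB.SUBMIT, CREATE_JOB commits implicitly, so no explicit COMMIT is needed to start the jobs. That also means any pending work in the calling session is committed at the same time.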
For PL/SQL Parallel processing you have the following options:
DBMS_SCHEDULER (newer)
DBMS_JOB (older)
Parallel Query
These will let you "emulate" forking and threading in PL/SQL. Of course, using these, you may realize the need to communicate between parallel executed procedures. To do so check out:
Advanced Queuing
DBMS_ALERT
DBMS_PIPE
Personally I've implemented a parallel processing system using DBMS_Scheduler, and used DBMS_Pipe to communicate between "threads". I was very happy with the combination of the two, and my main goal (to reduce major processing times with a particular heavy-weight procedure) was achieved!
You do have another option starting in 11g. Oracle has introduced a package that does something similar to what you want to do, named DBMS_PARALLEL_EXECUTE.
According to them, "The DBMS_PARALLEL_EXECUTE package enables the user to incrementally update table data in parallel". A fairly good summary of how to use it is here.
Basically, you define how Oracle should break your job up into pieces (in your case, you seem to be passing a range of key values), and then it will start each of the pieces individually. There is certainly a little planning and a little extra coding involved in order to use it, but nothing that you shouldn't have been doing anyway.
The advantage of using a sanctioned method such as this is that Oracle even provides database views that can be used to monitor each of the independent threads.
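A minimal sketch of how this could look for the myProc example. The task name MYPROC_TASK and the table MY_TABLE with a numeric ID column are assumptions; the chunking column should be whatever key myProc's two arguments range over.

```sql
DECLARE
  l_task VARCHAR2(30) := 'MYPROC_TASK';  -- hypothetical task name
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => l_task);

  -- Split the key range of MY_TABLE.ID into chunks of 100 values each.
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_NUMBER_COL(
    task_name    => l_task,
    table_owner  => USER,
    table_name   => 'MY_TABLE',
    table_column => 'ID',
    chunk_size   => 100);

  -- Each chunk is run with :start_id and :end_id bound to its boundaries.
  -- The call returns once all chunks have been attempted.
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => l_task,
    sql_stmt       => 'begin myProc(:start_id, :end_id); end;',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 3);

  DBMS_OUTPUT.PUT_LINE('Task status code: ' ||
                       DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task));

  DBMS_PARALLEL_EXECUTE.DROP_TASK(l_task);
END;
/
```

While the task is running, another session can watch progress in the USER_PARALLEL_EXECUTE_CHUNKS view. Note that the package runs the chunks through scheduler jobs under the covers, so the CREATE JOB privilege is required.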
Another way of doing parallel (multi-threaded) PL/SQL is shown here:
http://www.williamrobertson.net/documents/parallel-plsql-launcher.html
The disadvantage of using dbms_job or dbms_scheduler is that you don't really know when your tasks are finished. I read that you don't care, but maybe you will change your mind in the future.
EDIT:
This article http://www.devx.com/dbzone/10MinuteSolution/20902/0/page/1 describes another way. It uses dbms_job and dbms_alert. The alerts are used to signal that the jobs are done (callback signal).
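The callback idea can be sketched roughly as follows. This is not the article's code; the alert name MYPROC_DONE, the message text, and the 60-second timeout are assumptions for illustration.

```sql
-- In the submitting session: register interest BEFORE starting the jobs,
-- then wait for the completion signal.
DECLARE
  l_message VARCHAR2(1800);
  l_status  INTEGER;
BEGIN
  DBMS_ALERT.REGISTER('MYPROC_DONE');

  -- ... submit the jobs here (DBMS_JOB.SUBMIT ... COMMIT) ...

  DBMS_ALERT.WAITONE(
    name    => 'MYPROC_DONE',
    message => l_message,
    status  => l_status,   -- 0 = alert received, 1 = timed out
    timeout => 60);        -- seconds

  IF l_status = 0 THEN
    DBMS_OUTPUT.PUT_LINE('Finished: ' || l_message);
  ELSE
    DBMS_OUTPUT.PUT_LINE('Timed out waiting for the job');
  END IF;

  DBMS_ALERT.REMOVE('MYPROC_DONE');
END;
/

-- At the end of the job's own code: signal completion.
BEGIN
  myProc(1, 100);
  DBMS_ALERT.SIGNAL('MYPROC_DONE', 'range 1-100');
  COMMIT;  -- the alert is only delivered when the signalling transaction commits
END;
/
```

Alerts are not queued: if several jobs signal the same alert name before the waiter wakes up, only the most recent message is seen. A separate alert name per job, or a status table, is a safer design when every completion must be observed.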
Here is an explanation of different ways of unloading data to a flat file. One of the ways shows how you can do parallel execution with PL/SQL to speed things up.
http://www.oracle-developer.net/display.php?id=425
The parallel pipelined approach listed here by askTom provides a more complex approach, but you will actually pause until the work is complete, unlike the DBMS Job techniques. That said, you did ask for the "asynchronous" technique, and DBMS_JOB is perfect for that.
Have you considered using Oracle Advanced Queuing?