how to run the stored procedure in batch mode or in run it in parallel processing - oracle

We are iterating 100k+ records from global temporary table.below stored procedure will iterate all records from glogal temp table one by one and has to process below three steps.
to see whether product is exists or not
to see whether product inside the assets are having the 'category' or not.
to see whether the assets are having file names starts with '%pdf%' or not.
So each record has to process these 3 steps and final document names will be stored in the table for the successful record. If any error comes in any of the steps then error message will be stored for that record.
Below stored procedure is taking long time to process Because its processing sequentially.
Is there any way to make this process faster in the stored procedure itself by doing batch process?
If it's not possible in stored procedure then can we change this code into Java and run this code in multi threaded mode? like creating 10 threads and each thread will take one record concurrently and process this code. I would be happy if somebody gives some pseudo code.
which approach is going to suggest?
DECLARE
V_NODE_ID VARCHAR2(20);
V_FILENAME VARCHAR2(100);
V_CATEGORY_COUNT INTEGER :=0;
FINAL_FILNAME VARCHAR2(2000);
V_FINAL_ERRORMESSAGE VARCHAR2(2000);
CURSOR C1 IS
SELECT isbn FROM GT_ADD_ISBNS GT;
CURSOR C2(v_isbn in varchar2) IS
SELECT ANP.NODE_ID NODE_ID
FROM
table1 ANP,
table2 ANPP,
table3 AN
WHERE
ANP.NODE_ID=AN.ID AND
ANPP.NODE_ID=ANP.NODE_ID AND
AN.NAME_ID =26 AND
ANP.CATEORGY='category' AND
ANP.QNAME_ID='categories' AND
ANP.NODE_ID IN(SELECT CHILD_NODE_ID
FROM TABLE_ASSOC START WITH PARENT_NODE_ID IN(v_isbn)
CONNECT BY PRIOR CHILD_NODE_ID = PARENT_NODE_ID);
BEGIN
--Iterating all Products
FOR R1 IN C1
LOOP
FINAL_FILNAME :='';
BEGIN
--To check whether Product is exists or not
SELECT AN.ID INTO V_NODE_ID
FROM TABLE1 AN,
TABLE2 ANP
WHERE
AN.ID=ANP.NODE_ID AND
ANP.VALUE in(R1.ISBN);
V_CATEGORY_COUNT :=0;
V_FINAL_ERRORMESSAGE :='';
--To check Whether Product inside the assets are having the 'category' is applied or not
FOR R2 IN C2(R1.ISBN)
LOOP
V_CATEGORY_COUNT := V_CATEGORY_COUNT+1;
BEGIN
--In this Logic Product inside the assets have applied the 'category' But those assets are having documents LIKE '%pdf%' or not
SELECT ANP.STRING_VALUE into V_FILENAME
FROM
table1 ANP,
table2 ANPP,
table3 ACD
WHERE
ANP.QNAME_ID=21 AND
ACD.ID=ANPP.LONG_VALUE
ANP.NODE_ID=ANPP.NODE_ID AND
ANPP.QNAME_ID=36 AND
ANP.STRING_VALUE LIKE '%pdf%' AND
ANP.NODE_ID=R2.NODE_ID;
FINAL_FILNAME := FINAL_FILNAME || V_FILENAME ||',';
EXCEPTION WHEN
NO_DATA_FOUND THEN
V_FINAL_ERRORMESSAGE:=V_FINAL_ERRORMESSAGE|| 'Category is applied for this Product But for the asset:'|| R2.NODE_ID || ':Documents[LIKE %pdf%] were not found ;';
UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE= V_FINAL_ERRORMESSAGE WHERE ISBN= R1.ISBN;
END;--Iterating for each NODEID
END LOOP;--Iterating the assets[Nodes] for each product of catgeory
-- DBMS_OUTPUT.PUT_LINE('R1.ISBN:' || R1.ISBN ||'::V_CATEGORY_COUNT:' || V_CATEGORY_COUNT);
IF(V_CATEGORY_COUNT = 0) THEN
UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE= 'Category is not applied to none of the Assets for this Product' WHERE ISBN= R1.ISBN;
END IF;
EXCEPTION WHEN
NO_DATA_FOUND THEN
UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE= 'Product is not Found:' WHERE ISBN= R1.ISBN;
END;
-- DBMS_OUTPUT.PUT_LINE( R1.ISBN || 'Final documents:'||FINAL_FILNAME);
UPDATE GT_ADD_ISBNS SET FILENAME=FINAL_FILNAME WHERE ISBN= R1.ISBN;
COMMIT;
END LOOP;--looping gt_isbns
END;

You have a number of potential performance hits. Here's one:
"We are iterating 100k+ records from global temporary table"
Global temporary tables can be pretty slow. Populating them means writing all that data to disk; reading from them means reading from disk. That's a lot of I/O which might be avoidable. Also, GTTs use the temporary tablespace so you may be in contention with other sessions doing large sorts.
Here's another red flag:
FOR R1 IN C1 LOOP
... FOR R2 IN C2(R1.ISBN) LOOP
SQL is a set-based language. It is optimised for joining tables and returning sets of data in a highly-performative fashion. Nested cursor loops mean row-by-row processing which is undoubtedly easier to code but may be orders of magnitude slower than the equivalent set operation would be.
--To check whether Product is exists or not
You have several queries selecting from the same tables (AN, 'ANP) using the same criteria (isbn`). Perhaps all these duplicates are the only way of validating your business rules but it seems unlikely.
FINAL_FILNAME := FINAL_FILNAME || V_FILENAME ||',';
Maybe you could rewrite your query to use listagg() instead of using procedural logic to concatenate a string?
UPDATE GT_ADD_ISBNS
Again, all your updates are single row operations instead of set ones.
"Is there any way to make this process faster in the stored procedure itself by doing batch process?"
Without knowing your rules and the context we cannot rewrite your logic for you, but 15-16 hours is way too long for this so you can definitely reduce the elapsed time.
Things to consider:
Replace the writing and reading to the temporary table with the query you use to populate it
Rewrite the loops to use BULK COLLECT with a high LIMIT (e.g. 1000) to improve the select efficiency. Find out more.
Populate arrays and use FORALL to improve the efficiency of the updates. Find out more.
Try to remove all those individual look-ups by incorporating the logic into the main query, using OUTER JOIN syntax to test for existence.
These are all guesses. If you really want to know where the procedure is spending the time - and that knowledge is the root of all successful tuning, so you ought to want to know - you should run the procedure under a PL/SQL Profiler. This will tell you which lines cost the most time, and those are usually the ones where you need to focus your tuning effort. If you don't already have access to DBMS_PROFILER you will need a DBA to run the install script for you. Find out more.
" can we change this code into Java and run this code in multi threaded mode?"
Given that one of the reasons for slowing down the procedure is the I/O cost of selecting from the temporary table there's a good chance multi-threading might introduce further contention and actually make things worse. You should seek to improve the stored procedure first.

Related

How to delete huge rows in oracle table using parallel sessions query quickly

I am using mentioned query to delete 250 million plus rows from my table and it is taking more time
I have tried with t_delete limit with up to 20000.
Still slow deletion happening.
Please suggest a few optimisations in the same code to done my job faster.
DECLARE
TYPE tt_delete IS TABLE OF ROWID; t_delete tt_delete;
CURSOR cIMAV IS SELECT ROWID FROM moc_attribute_value where id in (select
id from ORPHANS_MAV);
total Number:=0;
rcount Number:=0;
Stmt1 varchar2(2000);
Stmt2 varchar2(2000);
BEGIN
--- CREATE TABLE orphansInconsistenDelProgress (currentTable
VARCHAR(100), deletedCount INT, totalToDelete INT);
--- INSERT INTO orphansInconsistenDelProgress (currentTable,
deletedCount,totalToDelete) values ('',0,0);
Stmt1:='ALTER SESSION SET parallel_degree_policy = AUTO';
Stmt2:='ALTER SESSION FORCE PARALLEL DML';
EXECUTE IMMEDIATE Stmt1;
EXECUTE IMMEDIATE Stmt2;
--- ALTER SESSION SET parallel_degree_policy = AUTO;
--- ALTER SESSION FORCE PARALLEL DML;
COMMIT;
--- MOC_ATTRIBUTE_VALUE
SELECT count(*) INTO total FROM ORPHANS_MAV;
UPDATE orphansInconsistenDelProgress SET currentTable='ORPHANS_MAV',
totalToDelete=total;
rcount := 0;
OPEN cIMAV;
LOOP
FETCH cIMAV BULK COLLECT INTO t_delete LIMIT 2000;
EXIT WHEN t_delete.COUNT = 0;
FORALL i IN 1..t_delete.COUNT
DELETE moc_attribute_value WHERE ROWID = t_delete (i);
COMMIT;
rcount := rcount + 2000;
UPDATE orphansInconsistenDelProgress SET deletedCount=rcount;
END LOOP;
CLOSE cIMAV;
COMMIT;
END;
/
A single Oracle parallel query can simplify the code and improve performance.
declare
execute immediate 'alter session enable parallel dml';
delete /*+ parallel */
from moc_attribute_value
where id in (select id from ORPHANS_MAV);
update OrphansInconsistenDelProgress
set currentTable = 'ORPHANS_MAV',
totalToDelete = sql%rowcount;
commit;
end;
/
In general, we want to either let Oracle break the task into pieces or use our own custom chunking. The original code seems to be doing both - it reads the data in chunks, and then submits each chunk to be further divided into a parallel delete. That approach generates lots of tiny pieces, and Oracle likely wastes a lot of time on things like thread coordination.
Deleting a large number of rows is expensive because there's no way to avoid REDO and UNDO. You might want to look into using DDL options, such as truncating a partition, or dropping and recreating the table. (But be careful recreating objects, it's difficult to perfectly recreate complex objects. We tend to forget things like privileges and table options.)
Tuning parallelism and large jobs is complicated. It's important to use the best monitoring tools, to ensure that Oracle is requesting, allocating, and using the right number of parallel processes, and that the execution plan is correct. One strong advantage of using a single SQL statement is that you can use real-time SQL monitoring reports to monitor progress. If you have the Diagnostics and Tuning Pack licenses, find the SQL_ID in GV$SQL and generate the report with select dbms_sqltune.report_sql_monitor('your SQL_ID here');.
maybe use SQL TRUNCATE TABLE,
Truncate table is faster and uses lesser resources than DELETE TABLE command.
If you are keeping only a fraction of the rows, it is likely to be much faster to copy over the rows to keep, then swap tables and delete the old table.
(No I don't know the threshold at which this is faster than DELETEing.)

How to create a table and insert in the same statement

I'm new to Oracle and I'm struggling with this:
DECLARE
cnt NUMBER;
BEGIN
SELECT COUNT(*) INTO cnt FROM all_tables WHERE table_name like 'Newtable';
IF(cnt=0) THEN
EXECUTE IMMEDIATE 'CREATE TABLE Newtable ....etc';
END IF;
COMMIT;
SELECT COUNT(*) INTO cnt FROM Newtable where id='something'
IF (cnt=0) THEN
EXECUTE IMMEDIATE 'INSERT INTO Newtable ....etc';
END IF;
END;
This keeps crashing and gives me the "PL/SQL: ORA-00942:table or view does not exist" on the insert-line. How can I avoid this? Or what am I doing wrong? I want these two statements (in reality it's a lot more of course) in a single transaction.
It isn't the insert that is the problem, it's the select two lines before. You have three statements within the block, not two. You're selecting from the same new table that doesn't exist yet. You've avoided that in the insert by making that dynamic, but you need to do the same for the select:
EXECUTE IMMEDIATE q'[SELECT COUNT(*) FROM Newtable where id='something']'
INTO cnt;
SQL Fiddle.
Creating a table at runtime seems wrong though. You said 'for safety issues the table can only exist if it's filled with the correct dataset', which doesn't entirely make sense to me - even if this block is creating and populating it in one go, anything that relies on it will fail or be invalidated until this runs. If this is part of the schema creation then making it dynamic doesn't seem to add much. You also said you wanted both to happen in one transaction, but the DDL will do an implicit commit, you can't roll back DDL, and your manual commit will start a new transaction for the insert(s) anyway. Perhaps you mean the inserts shouldn't happen if the table creation fails - but they would fail anyway, whether they're in the same block or not. It seems a bit odd, anyway.
Also, using all_tables for the check could still cause this to behave oddly. If that table exists in another schema, you create will be skipped, but you select and insert might still fail as they might not be able to see, or won't look for, the other schema version. Using user_tables or adding an owner check might be a bit safer.
Try the following approach, i.e. create and insert are in two different blocks
DECLARE
cnt NUMBER;
BEGIN
SELECT COUNT (*)
INTO cnt
FROM all_tables
WHERE table_name LIKE 'Newtable';
IF (cnt = 0)
THEN
EXECUTE IMMEDIATE 'CREATE TABLE Newtable(c1 varchar2(256))';
END IF;
END;
DECLARE
cnt2 NUMBER;
BEGIN
SELECT COUNT (*)
INTO cnt2
FROM newtable
WHERE c1 = 'jack';
IF (cnt2 = 0)
THEN
EXECUTE IMMEDIATE 'INSERT INTO Newtable values(''jill'')';
END IF;
END;
Oracle handles the execution of a block in two steps:
First it parses the block and compiles it in an internal representation (so called "P code")
It then runs the P code (it may be interpreted or compiled to machine code, depending on your architecture and Oracle version)
For compiling the code, Oracle must know the names (and the schema!) of the referenced tables. Your table doesn't exist yet, hence there is no schema and the code does not compile.
To your intention to create the tables in one big transaction: This will not work. Oracle always implicitly commits the current transaction before and after a DDL statement (create table, alter table, truncate table(!) etc.). So after each create table, Oracle will commit the current transaction and starts a new one.

Nested cursor in a cursor

I have a cursor which is
CURSOR B_CUR IS select DISTINCT big_id from TEMP_TABLE;
This would return multiple values. Earlier it was being used as
FOR b_id IN B_CUR LOOP
select s.col1, s.col2 INTO var1, var2 from sometable s where s.col3 = b_id.col1;
END LOOP;
Earlier it was certain that the inner select query would always return 1 row. Now this query can return multiple rows. How can I change this logic?
I was thinking to create a nested cursor which will fetch into an array of record type (which i will declare) but I have no idea how nested cursor would work here.
My main concern is efficiency. Since it would be working on millions of records per execution. Could you guys suggest what would be the best approach here?
Normally, you would just join the two tables.
FOR some_cursor IN (SELECT s.col1,
s.col2
FROM sometable s
JOIN temp_table t ON (s.col3 = t.col1))
LOOP
<<do something>>
END LOOP
Since you are concerned about efficiency, however
Is TEMP_TABLE really a temporary table? If so, why? It is exceedingly rare that Oracle actually needs to use temporary tables so that leads me to suspect that you're probably doing something inefficient to populate the temporary table in the first place.
Why do you have a cursor FOR loop to process the data from TEMP_TABLE? Row-by-row processing is the slowest way to do anything in PL/SQL so it would generally be avoided if you're concerned about efficiency. From a performance standpoint, you want to maximize SQL so that rather than doing a loop that did a series of single-row INSERT or UPDATE operations, you'd do a single INSERT or UPDATE that modified an entire set of rows. If you really need to process data in chunks, that's where PL/SQL collections and bulk processing would come in to play but that will not be as efficient as straight SQL.
Why do you have the DISTINCT in your query against TEMP_TABLE? Do you really expect that there will be duplicate big_id values that are not erroneous? Most of the time, people use DISTINCT incorrectly either to cover up problems where data has been joined incorrectly or where you're forcing Oracle to do an expensive sort just in case incorrect data gets created in the future when a constraint would be the more appropriate way to protect yourself.
FOR b_id IN B_CUR LOOP
for c_id in (select s.col1, s.col2 INTO var1, var2 from sometable s where s.col3 = b_id.col1)loop
......
end loop;
END LOOP;

bulk collect in oracle

How to query bulk collection? If for example I have
select name
bulk collect into namesValues
from table1
where namesValues is dbms_sql.varchar2_table.
Now, I have another table XYZ which contains
name is_valid
v
h
I want to update is_valid to 'Y' if name is in table1 else 'N'. Table1 has 10 million rows. After bulk collecting I want to execute
update xyz
set is_valid ='Y'
where name in namesValue.
How to query namesValue? Or is there is another option. Table1 has no index.
please help.
As Tom Kyte (Oracle Corp. Vice President) says:
My mantra, that I'll be sticking with thank you very much, is:
You should do it in a single SQL statement if at all possible.
If you cannot do it in a single SQL Statement, then do it in PL/SQL.
If you cannot do it in PL/SQL, try a Java Stored Procedure.
If you cannot do it in Java, do it in a C external procedure.
If you cannot do it in a C external routine, you might want to
seriously think about why it is you need to do it…
think in sets...
learn all there is to learn about SQL...
You should perform your update in SQL if you can. If you need to add an index to do this then that might be preferable to looping through a collection populated with BULK COLLECT.
If however, this is some sort of assignment....
You should specify it as such but here's how you would do it.
I have assumed that your DB server does not have the capacity to hold 10 million records in memory so rather than BULK COLLECTing all 10 million records in one go I have put the BULK COLLECT into a loop to reduce your memory overheads. If this is not the case then you can omit the bulk collect loop.
DECLARE
c_bulk_limit CONSTANT PLS_INTEGER := 500000;
--
CURSOR names_cur
IS
SELECT name
FROM table1;
--
TYPE namesValuesType IS TABLE OF table1.name%TYPE
INDEX BY PLS_INTEGER;
namesValues namesValuesType;
BEGIN
-- Populate the collection
OPEN name_cur;
LOOP
-- Fetch the records in a loop limiting them
-- to the c_bulk_limit amount at a time
FETCH name_cur BULK COLLECT INTO namesValues
LIMIT c_bulk_limit;
-- Process the records in your collection
FORALL x IN INDICES OF namesValues
UPDATE xyz
SET is_valid ='Y'
WHERE name = namesValue(x)
AND is_valid != 'Y';
-- Set up loop exit criteria
EXIT WHEN namesValues.COUNT < c_bulk_limit;
END LOOP;
CLOSE name_cur;
-- You want to update all remaining rows to 'N'
UPDATE xyz
SET is_valid ='N'
WHERE is_valid IS NULL;
EXCEPTION
WHEN others
THEN
IF name_cur%ISOPEN
THEN
CLOSE name_cur;
END IF;
-- Re-raise the exception;
RAISE;
END;
/
Depending upon your rollback segment sizes etc. you may want to issue interim commits within the bulk collect loop but be aware that you will not then be able to rollback these changes. I deliberately haven't added any COMMITs to this so you can choose where to put them to suit your system.
You also might want to change the size of the c_bulk_limit constant depending upon the resources available to you.
Your update will still cause you problems if the xyz table is large and there is no index on the name column.
Hope it helps...
"Table1 has no index."
Well there's your problem right there. Why not? Put an index on TABLE1.NAME and use a normal SQL UPDATE to amend the data in XYZ.
Trying to solve this problem with bulk collect is not the proper approach.

Performance problem with Oracle BULK FETCH and FORALL insert

I am trying to copied records from one table to another as fast as possible.
Currently I have a simple cursor loop similiar to this:
FOR rec IN source_cursor LOOP
INSERT INTO destination (a, b) VALUES (rec.a, rec.b)
END LOOP;
I want to speed it up to be super fast so am trying some BULK operations (a BULK FETCH, then a FORALL insert):
Here is what I have for the bulk select / forall insert.
DECLARE
TYPE t__event_rows IS TABLE OF _event%ROWTYPE;
v__event_rows t__event_rows;
CURSOR c__events IS
SELECT * FROM _EVENT ORDER BY MESSAGE_ID;
BEGIN
OPEN c__events;
LOOP
FETCH c__events BULK COLLECT INTO v__event_rows LIMIT 10000; -- limit to 10k to avoid out of memory
EXIT WHEN c__events%NOTFOUND;
FORALL i IN 1..v__event_rows.COUNT SAVE EXCEPTIONS
INSERT INTO destinatoin
( col1, col2, a_sequence)
VALUES
( v__event_rows(i).col1, v__event_rows(i).col2, SOMESEQEUENCE.NEXTVAL );
END LOOP;
CLOSE c__events;
END;
My problem is that I'm not seeing any big gains in performance so far. From what I read it should be 10x-100x faster.
Am I missing a bottleneck here somewhere?
The only benefit your code has over a simple INSERT+SELECT is that you save exceptions, plus (as Justin points out) you have a pointless ORDER BY which is making it do a whole lot of meaningless work. You then don't have any code to do anything with the exceptions that were saved, anyway.
I'd just implement it as a INSERT+SELECT.
You donot have to use loops unnecessarily until it is required in the coding itself.

Resources