Nested cursor in a cursor - oracle

I have a cursor which is
CURSOR B_CUR IS select DISTINCT big_id from TEMP_TABLE;
This would return multiple values. Earlier it was being used as
FOR b_id IN B_CUR LOOP
  select s.col1, s.col2 INTO var1, var2 from sometable s where s.col3 = b_id.big_id;
END LOOP;
Earlier it was certain that the inner select query would always return 1 row. Now this query can return multiple rows. How can I change this logic?
I was thinking of creating a nested cursor which would fetch into an array of a record type (which I will declare), but I have no idea how a nested cursor would work here.
My main concern is efficiency, since it would be working on millions of records per execution. Could you suggest what would be the best approach here?

Normally, you would just join the two tables.
FOR some_cursor IN (SELECT s.col1,
                           s.col2
                      FROM sometable s
                      JOIN temp_table t ON (s.col3 = t.big_id))
LOOP
  <<do something>>
END LOOP;
Since you are concerned about efficiency, however, a few questions:
Is TEMP_TABLE really a temporary table? If so, why? It is exceedingly rare that Oracle actually needs to use temporary tables so that leads me to suspect that you're probably doing something inefficient to populate the temporary table in the first place.
Why do you have a cursor FOR loop to process the data from TEMP_TABLE? Row-by-row processing is the slowest way to do anything in PL/SQL so it would generally be avoided if you're concerned about efficiency. From a performance standpoint, you want to maximize SQL so that rather than doing a loop that did a series of single-row INSERT or UPDATE operations, you'd do a single INSERT or UPDATE that modified an entire set of rows. If you really need to process data in chunks, that's where PL/SQL collections and bulk processing would come in to play but that will not be as efficient as straight SQL.
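For example, here is a hedged sketch of the difference (the UPDATE and the 'PROCESSED' value are invented for illustration; only TEMP_TABLE, sometable, and their columns come from the question):
-- Row-by-row: one context switch and one UPDATE per big_id (slow)
BEGIN
  FOR b_id IN (SELECT DISTINCT big_id FROM temp_table) LOOP
    UPDATE sometable s
       SET s.col2 = 'PROCESSED'
     WHERE s.col3 = b_id.big_id;
  END LOOP;
END;
/
-- Set-based: one statement modifies the entire set (fast)
UPDATE sometable s
   SET s.col2 = 'PROCESSED'
 WHERE s.col3 IN (SELECT big_id FROM temp_table);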
Why do you have the DISTINCT in your query against TEMP_TABLE? Do you really expect that there will be duplicate big_id values that are not erroneous? Most of the time, people use DISTINCT incorrectly either to cover up problems where data has been joined incorrectly or where you're forcing Oracle to do an expensive sort just in case incorrect data gets created in the future when a constraint would be the more appropriate way to protect yourself.
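For instance, if big_id really should be unique, a unique constraint (the constraint name here is assumed) both documents and enforces that expectation, making the DISTINCT unnecessary:
ALTER TABLE temp_table ADD CONSTRAINT temp_table_big_id_uk UNIQUE (big_id);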

FOR b_id IN B_CUR LOOP
  FOR c_id IN (select s.col1, s.col2 from sometable s where s.col3 = b_id.big_id) LOOP
    ......
  END LOOP;
END LOOP;

Related

How to run a stored procedure in batch mode or run it with parallel processing

We are iterating 100k+ records from a global temporary table. The stored procedure below iterates over all records from the global temp table one by one, and for each record it has to process these three steps:
to check whether the product exists or not
to check whether the assets inside the product have the 'category' applied or not
to check whether those assets have file names containing 'pdf' or not
So each record has to go through these 3 steps, and the final document names are stored in the table for each successful record. If any error occurs in any of the steps, the error message is stored for that record.
The stored procedure below is taking a long time to process because it works sequentially.
Is there any way to make this process faster within the stored procedure itself by doing batch processing?
If it's not possible in the stored procedure, could we port this code to Java and run it in multi-threaded mode, e.g. creating 10 threads, with each thread taking one record concurrently and processing it? I would be happy if somebody gave some pseudo code.
Which approach would you suggest?
DECLARE
  V_NODE_ID            VARCHAR2(20);
  V_FILENAME           VARCHAR2(100);
  V_CATEGORY_COUNT     INTEGER := 0;
  FINAL_FILENAME       VARCHAR2(2000);
  V_FINAL_ERRORMESSAGE VARCHAR2(2000);

  CURSOR C1 IS
    SELECT isbn FROM GT_ADD_ISBNS GT;

  CURSOR C2(v_isbn IN VARCHAR2) IS
    SELECT ANP.NODE_ID NODE_ID
      FROM table1 ANP,
           table2 ANPP,
           table3 AN
     WHERE ANP.NODE_ID = AN.ID
       AND ANPP.NODE_ID = ANP.NODE_ID
       AND AN.NAME_ID = 26
       AND ANP.CATEGORY = 'category'
       AND ANP.QNAME_ID = 'categories'
       AND ANP.NODE_ID IN (SELECT CHILD_NODE_ID
                             FROM TABLE_ASSOC
                            START WITH PARENT_NODE_ID IN (v_isbn)
                          CONNECT BY PRIOR CHILD_NODE_ID = PARENT_NODE_ID);
BEGIN
  -- Iterate all products
  FOR R1 IN C1
  LOOP
    FINAL_FILENAME := '';
    BEGIN
      -- Check whether the product exists or not
      SELECT AN.ID INTO V_NODE_ID
        FROM TABLE1 AN,
             TABLE2 ANP
       WHERE AN.ID = ANP.NODE_ID
         AND ANP.VALUE IN (R1.ISBN);
      V_CATEGORY_COUNT := 0;
      V_FINAL_ERRORMESSAGE := '';
      -- Check whether the assets inside the product have the 'category' applied or not
      FOR R2 IN C2(R1.ISBN)
      LOOP
        V_CATEGORY_COUNT := V_CATEGORY_COUNT + 1;
        BEGIN
          -- The asset has the 'category' applied; check whether it has documents LIKE '%pdf%'
          SELECT ANP.STRING_VALUE INTO V_FILENAME
            FROM table1 ANP,
                 table2 ANPP,
                 table3 ACD
           WHERE ANP.QNAME_ID = 21
             AND ACD.ID = ANPP.LONG_VALUE
             AND ANP.NODE_ID = ANPP.NODE_ID
             AND ANPP.QNAME_ID = 36
             AND ANP.STRING_VALUE LIKE '%pdf%'
             AND ANP.NODE_ID = R2.NODE_ID;
          FINAL_FILENAME := FINAL_FILENAME || V_FILENAME || ',';
        EXCEPTION
          WHEN NO_DATA_FOUND THEN
            V_FINAL_ERRORMESSAGE := V_FINAL_ERRORMESSAGE || 'Category is applied for this Product But for the asset:' || R2.NODE_ID || ':Documents[LIKE %pdf%] were not found ;';
            UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE = V_FINAL_ERRORMESSAGE WHERE ISBN = R1.ISBN;
        END; -- iterating each NODE_ID
      END LOOP; -- iterating the assets [nodes] for each product of the category
      -- DBMS_OUTPUT.PUT_LINE('R1.ISBN:' || R1.ISBN || '::V_CATEGORY_COUNT:' || V_CATEGORY_COUNT);
      IF (V_CATEGORY_COUNT = 0) THEN
        UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE = 'Category is not applied to any of the Assets for this Product' WHERE ISBN = R1.ISBN;
      END IF;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        UPDATE GT_ADD_ISBNS SET ERROR_MESSAGE = 'Product is not Found:' WHERE ISBN = R1.ISBN;
    END;
    -- DBMS_OUTPUT.PUT_LINE(R1.ISBN || 'Final documents:' || FINAL_FILENAME);
    UPDATE GT_ADD_ISBNS SET FILENAME = FINAL_FILENAME WHERE ISBN = R1.ISBN;
    COMMIT;
  END LOOP; -- looping gt_isbns
END;
You have a number of potential performance hits. Here's one:
"We are iterating 100k+ records from global temporary table"
Global temporary tables can be pretty slow. Populating them means writing all that data to disk; reading from them means reading from disk. That's a lot of I/O which might be avoidable. Also, GTTs use the temporary tablespace so you may be in contention with other sessions doing large sorts.
Here's another red flag:
FOR R1 IN C1 LOOP
... FOR R2 IN C2(R1.ISBN) LOOP
SQL is a set-based language. It is optimised for joining tables and returning sets of data in a highly performant fashion. Nested cursor loops mean row-by-row processing, which is undoubtedly easier to code but may be orders of magnitude slower than the equivalent set operation.
--To check whether Product is exists or not
You have several queries selecting from the same tables (AN, ANP) using the same criteria (ISBN). Perhaps all these duplicates are the only way of validating your business rules, but it seems unlikely.
FINAL_FILNAME := FINAL_FILNAME || V_FILENAME ||',';
Maybe you could rewrite your query to use listagg() instead of using procedural logic to concatenate a string?
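For example, here is a simplified sketch against just table1 from the posted procedure (it assumes Oracle 11gR2 or later for LISTAGG, and drops the other joins for brevity):
SELECT ANP.NODE_ID,
       LISTAGG(ANP.STRING_VALUE, ',')
         WITHIN GROUP (ORDER BY ANP.STRING_VALUE) AS final_filename
  FROM table1 ANP
 WHERE ANP.QNAME_ID = 21
   AND ANP.STRING_VALUE LIKE '%pdf%'
 GROUP BY ANP.NODE_ID;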
UPDATE GT_ADD_ISBNS
Again, all your updates are single row operations instead of set ones.
"Is there any way to make this process faster in the stored procedure itself by doing batch process?"
Without knowing your rules and the context we cannot rewrite your logic for you, but 15-16 hours is way too long for this so you can definitely reduce the elapsed time.
Things to consider:
Replace the writing and reading to the temporary table with the query you use to populate it
Rewrite the loops to use BULK COLLECT with a high LIMIT (e.g. 1000) to improve the select efficiency; a sketch follows this list.
Populate arrays and use FORALL to improve the efficiency of the updates; this is also shown in the sketch below.
Try to remove all those individual look-ups by incorporating the logic into the main query, using OUTER JOIN syntax to test for existence.
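As a minimal sketch of the BULK COLLECT / FORALL pattern from the two bullets above (only GT_ADD_ISBNS comes from the question; the per-row validation is reduced to a hypothetical placeholder):
DECLARE
  TYPE t_isbns  IS TABLE OF gt_add_isbns.isbn%TYPE          INDEX BY PLS_INTEGER;
  TYPE t_errors IS TABLE OF gt_add_isbns.error_message%TYPE INDEX BY PLS_INTEGER;
  l_isbns  t_isbns;
  l_errors t_errors;
  CURSOR c_isbns IS SELECT isbn FROM gt_add_isbns;
BEGIN
  OPEN c_isbns;
  LOOP
    FETCH c_isbns BULK COLLECT INTO l_isbns LIMIT 1000;  -- high LIMIT, as suggested
    EXIT WHEN l_isbns.COUNT = 0;
    -- Derive each row's result in PL/SQL (placeholder for the real validation logic)
    FOR i IN 1 .. l_isbns.COUNT LOOP
      l_errors(i) := 'hypothetical message for ' || l_isbns(i);
    END LOOP;
    -- One context switch per batch of updates instead of one per row
    FORALL i IN 1 .. l_isbns.COUNT
      UPDATE gt_add_isbns
         SET error_message = l_errors(i)
       WHERE isbn = l_isbns(i);
  END LOOP;
  CLOSE c_isbns;
END;
/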
These are all guesses. If you really want to know where the procedure is spending its time - and that knowledge is the root of all successful tuning, so you ought to want to know - you should run the procedure under a PL/SQL profiler. This will tell you which lines cost the most time, and those are usually the ones where you need to focus your tuning effort. If you don't already have access to DBMS_PROFILER you will need a DBA to run the install script for you.
" can we change this code into Java and run this code in multi threaded mode?"
Given that one of the reasons for slowing down the procedure is the I/O cost of selecting from the temporary table there's a good chance multi-threading might introduce further contention and actually make things worse. You should seek to improve the stored procedure first.

Questions related to efficient DML in procedure/functions

I have 2 questions regarding the performance of PL/SQL scripts when executing DML. Of course EXECUTE IMMEDIATE is the slowest one; that's why we have FORALL, bulk insert, etc. My questions are:
I have to manipulate data in 3 different tables: Table1 (insert data), Table2 (update data) and Table3 (delete data). All of these would be done based on the values fetched using a cursor. The question is: what would be more efficient here?
Putting each of these statements in an individual FORALL block, i.e.
fetch cursor
loop
  forall loop for table 1
  forall loop for table 2
  forall loop for table 3
end loop
OR
a global loop and execute these statements in that loop, i.e.
fetch cursor
loop
  for i IN array.count
  loop
    3 statements for DML
  end loop
end loop
Now my second question:
What is the efficient way to delete records in a loop? I fetched the values of the records to be deleted through the cursor. Now what would be the efficient way to delete them?
P.S.:
Excuse my formatting
The most efficient approach would be to write three SQL statements, assuming the data fetched from the cursor is stable over the period of time that the procedure is running:
INSERT INTO table1( list_of_columns )
<<your SELECT statement>>;
UPDATE table2
SET (<<list of columns>>) = (<<your SELECT statement joined to table2>>)
WHERE EXISTS( <<your SELECT statement joined to table2>> );
DELETE FROM table3
WHERE EXISTS( <<your SELECT statement joined to table3>> );
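For illustration only (source_data and its columns are invented), the three set-based statements might look like:
INSERT INTO table1 (id, val)
SELECT src.id, src.val
  FROM source_data src
 WHERE src.status = 'NEW';

UPDATE table2 t
   SET t.val = (SELECT src.val FROM source_data src WHERE src.id = t.id)
 WHERE EXISTS (SELECT 1 FROM source_data src WHERE src.id = t.id);

DELETE FROM table3 t
 WHERE EXISTS (SELECT 1 FROM source_data src WHERE src.id = t.id);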
If the SELECT statement will potentially return different results in each of the three DML statements, then it makes sense to accept the performance hit of using a cursor, bulk collecting the data into PL/SQL collections, and looping over the collections in order to ensure consistent results. If that's what you're doing, it will be more efficient to have three FORALL statements since that involves fewer context shifts between the SQL and PL/SQL engines.
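A hedged sketch of that shape (the cursor, collections, and key columns are invented for illustration):
DECLARE
  TYPE t_ids  IS TABLE OF NUMBER        INDEX BY PLS_INTEGER;
  TYPE t_vals IS TABLE OF VARCHAR2(100) INDEX BY PLS_INTEGER;
  l_ids  t_ids;
  l_vals t_vals;
  CURSOR src_cur IS SELECT id, val FROM source_data;
BEGIN
  OPEN src_cur;
  LOOP
    FETCH src_cur BULK COLLECT INTO l_ids, l_vals LIMIT 1000;
    EXIT WHEN l_ids.COUNT = 0;
    -- Three FORALL statements per batch: far fewer context switches
    -- than executing three DML statements inside a row-by-row loop
    FORALL i IN 1 .. l_ids.COUNT
      INSERT INTO table1 (id, val) VALUES (l_ids(i), l_vals(i));
    FORALL i IN 1 .. l_ids.COUNT
      UPDATE table2 SET val = l_vals(i) WHERE id = l_ids(i);
    FORALL i IN 1 .. l_ids.COUNT
      DELETE FROM table3 WHERE id = l_ids(i);
  END LOOP;
  CLOSE src_cur;
END;
/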
What is the efficient way to delete records in loop? I fetched the values of the records to be deleted through the cursor. now what would be the efficient way to delete them?
I'm not sure I understand the question. Wouldn't you just do a FORALL just as you would for an INSERT or an UPDATE?
FORALL i IN l_array.first .. l_array.last
DELETE FROM some_table
WHERE some_key = l_array(i);
Or are you asking a different question?

bulk collect in oracle

How do I query a bulk collection? If, for example, I have
select name
bulk collect into namesValues
from table1
where namesValues is declared as a dbms_sql.varchar2_table.
Now, I have another table XYZ which contains
name is_valid
v
h
I want to update is_valid to 'Y' if the name is in table1, else 'N'. Table1 has 10 million rows. After bulk collecting I want to execute
update xyz
set is_valid = 'Y'
where name in namesValues;
How do I query namesValues? Or is there another option? Table1 has no index.
Please help.
As Tom Kyte (Oracle Corp. Vice President) says:
My mantra, that I'll be sticking with thank you very much, is:
You should do it in a single SQL statement if at all possible.
If you cannot do it in a single SQL Statement, then do it in PL/SQL.
If you cannot do it in PL/SQL, try a Java Stored Procedure.
If you cannot do it in Java, do it in a C external procedure.
If you cannot do it in a C external routine, you might want to
seriously think about why it is you need to do it…
think in sets...
learn all there is to learn about SQL...
You should perform your update in SQL if you can. If you need to add an index to do this then that might be preferable to looping through a collection populated with BULK COLLECT.
If, however, this is some sort of assignment...
You should specify it as such, but here's how you would do it.
I have assumed that your DB server does not have the capacity to hold 10 million records in memory so rather than BULK COLLECTing all 10 million records in one go I have put the BULK COLLECT into a loop to reduce your memory overheads. If this is not the case then you can omit the bulk collect loop.
DECLARE
  c_bulk_limit CONSTANT PLS_INTEGER := 500000;
  --
  CURSOR names_cur
  IS
    SELECT name
      FROM table1;
  --
  TYPE namesValuesType IS TABLE OF table1.name%TYPE
    INDEX BY PLS_INTEGER;
  namesValues namesValuesType;
BEGIN
  -- Populate the collection
  OPEN names_cur;
  LOOP
    -- Fetch the records in a loop, limiting them
    -- to the c_bulk_limit amount at a time
    FETCH names_cur BULK COLLECT INTO namesValues
      LIMIT c_bulk_limit;
    -- Process the records in your collection
    FORALL x IN INDICES OF namesValues
      UPDATE xyz
         SET is_valid = 'Y'
       WHERE name = namesValues(x)
         AND is_valid != 'Y';
    -- Set up loop exit criteria
    EXIT WHEN namesValues.COUNT < c_bulk_limit;
  END LOOP;
  CLOSE names_cur;
  -- You want to update all remaining rows to 'N'
  UPDATE xyz
     SET is_valid = 'N'
   WHERE is_valid IS NULL;
EXCEPTION
  WHEN others
  THEN
    IF names_cur%ISOPEN
    THEN
      CLOSE names_cur;
    END IF;
    -- Re-raise the exception
    RAISE;
END;
/
Depending upon your rollback segment sizes etc. you may want to issue interim commits within the bulk collect loop, but be aware that you will not then be able to roll back these changes. I deliberately haven't added any COMMITs to this so you can choose where to put them to suit your system.
You also might want to change the size of the c_bulk_limit constant depending upon the resources available to you.
Your update will still cause you problems if the xyz table is large and there is no index on the name column.
Hope it helps...
"Table1 has no index."
Well there's your problem right there. Why not? Put an index on TABLE1.NAME and use a normal SQL UPDATE to amend the data in XYZ.
Trying to solve this problem with bulk collect is not the proper approach.
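A hedged sketch of that approach (the index name is assumed; the table and column names come from the question):
CREATE INDEX table1_name_idx ON table1 (name);

UPDATE xyz x
   SET x.is_valid = CASE
                      WHEN EXISTS (SELECT 1 FROM table1 t WHERE t.name = x.name)
                      THEN 'Y'
                      ELSE 'N'
                    END;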

Performance problem with Oracle BULK FETCH and FORALL insert

I am trying to copy records from one table to another as fast as possible.
Currently I have a simple cursor loop similar to this:
FOR rec IN source_cursor LOOP
  INSERT INTO destination (a, b) VALUES (rec.a, rec.b);
END LOOP;
I want to speed it up to be super fast, so I am trying some BULK operations (a BULK FETCH, then a FORALL insert).
Here is what I have for the bulk select / FORALL insert:
DECLARE
  TYPE t__event_rows IS TABLE OF _event%ROWTYPE;
  v__event_rows t__event_rows;
  CURSOR c__events IS
    SELECT * FROM _EVENT ORDER BY MESSAGE_ID;
BEGIN
  OPEN c__events;
  LOOP
    FETCH c__events BULK COLLECT INTO v__event_rows LIMIT 10000; -- limit to 10k to avoid out of memory
    EXIT WHEN v__event_rows.COUNT = 0; -- %NOTFOUND here would skip the final partial batch
    FORALL i IN 1..v__event_rows.COUNT SAVE EXCEPTIONS
      INSERT INTO destination
        (col1, col2, a_sequence)
      VALUES
        (v__event_rows(i).col1, v__event_rows(i).col2, SOMESEQUENCE.NEXTVAL);
  END LOOP;
  CLOSE c__events;
END;
My problem is that I'm not seeing any big gains in performance so far. From what I read it should be 10x-100x faster.
Am I missing a bottleneck here somewhere?
The only benefit your code has over a simple INSERT+SELECT is that you save exceptions, plus (as Justin points out) you have a pointless ORDER BY which is making it do a whole lot of meaningless work. You then don't have any code to do anything with the exceptions that were saved, anyway.
I'd just implement it as an INSERT+SELECT.
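Something along these lines, assuming the column and sequence names from the question, and dropping the pointless ORDER BY:
INSERT INTO destination (col1, col2, a_sequence)
SELECT e.col1, e.col2, SOMESEQUENCE.NEXTVAL
  FROM _event e;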
You do not have to use loops unnecessarily; use them only when they are actually required.

PL/SQL: Best practice for fetching 2 or more joined tables from a cursor?

I've heard that it's a good practice to define your records in PL/SQL by using the %ROWTYPE attribute. This saves typing and allows your package to continue functioning even when a column is added or deleted. (Correct me if I'm wrong!)
However, when I am fetching from a cursor that involves a join, I find that I have to fetch into a programmer-defined record that includes a (quite-possibly long) hand-written list of every column returned by the join.
So my question is:
Is it possible to fetch into nested records, or fetch into a list of records, or do something to avoid such an ugly kludge? Everything I've tried leads to an error about the record not matching what's being returned by the cursor.
Returning the result of a join using a cursor seems like such a common use-case to me that it's strange that nothing related to this comes up in a search.
Thank you.
You can use cursor%rowtype.
Sample:
declare
cursor c_c is
select emp.*, dept.* -- use aliases if columns have the same name
from emp
, dept; -- for sample no join condition
r_c c_c%rowtype;
begin
for r_c in c_c loop -- with a FOR loop, even the declaration of r_c is not needed.
...
end loop;
end;
/
Why even bother with the cursor declaration?
This is equivalent.
begin
for r_c in (select emp.*, dept.* from emp, dept) loop
...
end loop;
end;
I see in your comment that you mention this. But I see the explicit cursor syntax used so much that I think it's important to show.
