Oracle: Bulk Collect performance - oracle

Can you help me to understand this phrase?
Without the bulk bind, PL/SQL sends a SQL statement to the SQL engine
for each record that is inserted, updated, or deleted leading to
context switches that hurt performance.

Within Oracle, there is a SQL virtual machine (VM) and a PL/SQL VM. When you need to move from one VM to the other VM, you incur the cost of a context shift. Individually, those context shifts are relatively quick, but when you're doing row-by-row processing, they can add up to account for a significant fraction of the time your code is spending. When you use bulk binds, you move multiple rows of data from one VM to the other with a single context shift, significantly reducing the number of context shifts, making your code faster.
Take, for example, an explicit cursor. If I write something like this
DECLARE
CURSOR c
IS SELECT *
FROM source_table;
l_rec source_table%rowtype;
BEGIN
OPEN c;
LOOP
FETCH c INTO l_rec;
EXIT WHEN c%notfound;
INSERT INTO dest_table( col1, col2, ... , colN )
VALUES( l_rec.col1, l_rec.col2, ... , l_rec.colN );
END LOOP;
END;
then every time I execute the fetch, I am
Performing a context shift from the PL/SQL VM to the SQL VM
Asking the SQL VM to execute the cursor to generate the next row of data
Performing another context shift from the SQL VM back to the PL/SQL VM to return my single row of data
And every time I insert a row, I'm doing the same thing. I am incurring the cost of a context shift to ship one row of data from the PL/SQL VM to the SQL VM, asking the SQL to execute the INSERT statement, and then incurring the cost of another context shift back to PL/SQL.
If source_table has 1 million rows, that's 4 million context shifts which will likely account for a reasonable fraction of the elapsed time of my code. If, on the other hand, I do a BULK COLLECT with a LIMIT of 100, I can eliminate 99% of my context shifts by retrieving 100 rows of data from the SQL VM into a collection in PL/SQL every time I incur the cost of a context shift and inserting 100 rows into the destination table every time I incur a context shift there.
If can rewrite my code to make use of bulk operations
DECLARE
CURSOR c
IS SELECT *
FROM source_table;
TYPE nt_type IS TABLE OF source_table%rowtype;
l_arr nt_type;
BEGIN
OPEN c;
LOOP
FETCH c BULK COLLECT INTO l_arr LIMIT 100;
EXIT WHEN l_arr.count = 0;
FORALL i IN 1 .. l_arr.count
INSERT INTO dest_table( col1, col2, ... , colN )
VALUES( l_arr(i).col1, l_arr(i).col2, ... , l_arr(i).colN );
END LOOP;
END;
Now, every time I execute the fetch, I retrieve 100 rows of data into my collection with a single set of context shifts. And every time I do my FORALL insert, I am inserting 100 rows with a single set of context shifts. If source_table has 1 million rows, this means that I've gone from 4 million context shifts to 40,000 context shifts. If context shifts accounted for, say, 20% of the elapsed time of my code, I've eliminated 19.8% of the elapsed time.
You can increase the size of the LIMIT to further reduce the number of context shifts but you quickly hit the law of diminishing returns. If you used a LIMIT of 1000 rather than 100, you'd eliminate 99.9% of the context shifts rather than 99%. That would mean that your collection was using 10x more PGA memory, however. And it would only eliminate 0.18% more elapsed time in our hypothetical example. You very quickly reach a point where the additional memory you're using adds more time than you save by eliminating additional context shifts. In general, a LIMIT somewhere between 100 and 1000 is likely to be the sweet spot.
Of course, in this example, it would be more efficient still to eliminate all context shifts and do everything in a single SQL statement
INSERT INTO dest_table( col1, col2, ... , colN )
SELECT col1, col2, ... , colN
FROM source_table;
It would only make sense to resort to PL/SQL in the first place if you're doing some sort of manipulation of the data from the source table that you can't reasonably implement in SQL.
Additionally, I used an explicit cursor in my example intentionally. If you are using implicit cursors, in recent versions of Oracle, you get the benefits of a BULK COLLECT with a LIMIT of 100 implicitly. There is another StackOverflow question that discusses the relative performance benefits of implicit and explicit cursors with bulk operations that goes into more detail about those particular wrinkles.

AS I understand this, there are two engine involved, PL/SQL engine and SQL Engine. Executing a query that make use of one engine at a time is more efficient than switching between the two
Example:
INSERT INTO t VALUES(1)
is processed by SQL engine while
FOR Lcntr IN 1..20
END LOOP
is executed by PL/SQL engine
If you combine the two statement above, putting INSERT in the loop,
FOR Lcntr IN 1..20
INSERT INTO t VALUES(1)
END LOOP
Oracle will be switching between the two engines, for the each (20) iterations.
In this case BULK INSERT is recommended which makes use of PL/SQL engine all through the execution

Related

Explicit cursors using bulk collect vs. implicit cursors: any performance issues?

In an older article from Oracle Magazine (now online as On Cursor FOR Loops) Steven Feuerstein showed an optimization for explicit cursor for loops using bulk collect (listing 4 in the online article):
DECLARE
CURSOR employees_cur is SELECT * FROM employees;
TYPE employee_tt IS TABLE OF employees_cur%ROWTYPE INDEX BY PLS_INTEGER;
l_employees employee_tt;
BEGIN
OPEN employees_cur;
LOOP
FETCH employees_cur BULK COLLECT INTO l_employees LIMIT 100;
-- process l_employees using pl/sql only
EXIT WHEN employees_cur%NOTFOUND;
END LOOP;
CLOSE employees_cur;
END;
I understand that bulk collect enhances the performance because there are less context switches between SQL and PL/SQL.
My question is about implicit cursor for loops:
BEGIN
FOR S in (SELECT * FROM employees)
LOOP
-- process current record of S
END LOOP;
END;
Is there a context switch in each loop for each record? Is the problem the same as with explicit cursors or is it somehow optimized "behind the scene"? Would it be better to rewrite the code using explicit cursors with bulk collect?
Starting from Oracle 10g the optimizing PL/SQL compiler can automatically convert FOR LOOPs into BULK COLLECT loops with a default array size of 100.
So generally there's no need to convert implicit FOR loops into BULK COLLECT loops.
But sometimes you may want to use BULK COLLECT instead. For example, if the default array size of 100 rows per fetch does not satisfy your requirements OR if you prefer to update your data within a set.
The same question was answered by Tom Kyte. You can check it here: Cursor FOR loops optimization in 10g
Yes, even if your -- process current record of S contains pure SQL and no PL/SQL you have context switch as the FOR ... LOOP is PL/SQL but the query is SQL.
Whenever possible you should prefer to process your data with single SQL statements (consider also MERGE, not only DELETE, UPDATE, INSERT), in most cases they are faster than a row-by-row processing.
Note, you will not gain any performance if you make a loop through l_employees and perform DLL for each record.
LIMIT 100 is rather useless. Processing only 100 rows at once would be almost the same as processing rows one-by-one - Oracle does not run on Z80 with 64K Memory.

PL/SQL Error: A loop that contains DML statements should be refactored to use BULK COLLECT and FORALL

I've searched across the whole internet for some examples, but I still can't get my head around why I can not use DML statement inside this cursor. I'm kind of missing the theory behind it, but I won't deny an example of how to do write this correctly would make my life lots easier as well. Here is the query I'm working on (Note: I removed exits when not found results, close if cursor already open and things like that just to focus on the main point here):
DECLARE
// lots of vars
// the cursor below gets all datasources connected to Node XXYZ123
CURSOR DataSourceCheck
IS
SELECT NODENAAM, NAAM, URL, DBNODE1, DBNODE2, DBUSERNAAM, DBNAAM
FROM SCHEMA.TABLENAME
WHERE NODENAAM = 'XXYZ123';
// this cursor will execute row-by-row based on the result set of above cursor
CURSOR CheckIfOnlyDataSource
IS
SELECT NODENAAM, NAAM, URL, DBNODE1, DBNODE2, DBUSERNAAM, DBNAAM
FROM SCHEMA.TABLENAME
WHERE DBUSERNAAM = var_dbusernaam AND (DBNode1 = var_dbnode1 OR DBNode2 = var_dbnode2);
BEGIN
OPEN DataSourceCheck;
LOOP
FETCH DataSourceCheck into var_nodenaam, var_naam, var_URL, var_dbnode1, var_dbnode2, var_dbusernaam, var_dbnaam;
var_rowcount:= 0;
OPEN CheckIfOnlyDataSource;
LOOP
FETCH CheckIfOnlyDataSource into var_nodenaam2, var_naam2, var_URL2, var_dbnode12, var_dbnode22, var_dbusernaam2, var_dbnaam2;
var_rowcount:= var_rowcount + 1;
END LOOP;
// only save result in a temp table when var_rowcount is 1 and not higher.
IF var_rowcount = 1
THEN
INSERT INTO global_temp_table
(t_dbusernaam, t_nodenaam, t_dbnode1, t_dbnode2, t_distinctcount)
VALUES
(var_dbusernaam2, var_nodenaam2, var_dbnode12, var_dbnode22, var_rowcount)
END IF;
CLOSE CheckIfOnlyDataSource;
END LOOP;
END;
The point of failure is this part, with the message that DML should be reconfigured into FORALL or BULK INTO statements:
IF var_rowcount = 1
THEN
INSERT INTO global_temp_table
(t_dbusernaam, t_nodenaam, t_dbnode1, t_dbnode2, t_distinctcount)
VALUES
(var_dbusernaam2, var_nodenaam2, var_dbnode12, var_dbnode22, var_rowcount)
END IF;
I don't understand why DML is not working in a row-by-row approach? The output is clearly stored inside the variables var_dbusernaam2, var_nodenaam2, var_dbnode12 and var_dbnode22, hence I can do a dbms_output.put_line to show them. But if it is stored into the variable already, then why can't I just store it simply into a table (this isn't billions of bulk data, not even 1000 records!).
Is there no simple workaround? I gave BULK COLLECT and FORALL a try, but I need a lot more time to invest to understand it and get the query right - the cursor in a cursor definately won't make it any easier.
In addition to the suggestion in Mottor's answer, the reason why Toad is flagging up your code is because row-by-row processing is slow. You've got a lot of context switching going on between the PL/SQL and SQL engines.
Think of it like building new wall near your house - if the bricks are delivered to the bottom of the drive, do you:
Go to the pile of bricks
Pick up a single brick
Go back to your wall
Add the brick onto the wall
Go back to step 1 and repeat until the wall is complete
(This is the equivalent of row-by-row processing)
Or:
Take your wheelbarrow down to the pile of bricks
Load your wheelbarrow with as many bricks as will fit and/or you can carry
Take the wheelbarrow back over to the wall
Add each brick into the wall
Go back to step 1 and repeat until the wall is complete.
(This is the equivalent of bulk processing.)
Of course, if you're canny, you could avoid all the walking and carrying required in the above scenarios by getting the bricks delivered right next to the wall in the first place. (This is the equivalent of set-based processing).
Turning your procedure into a set-based approach (incorporating Mottor's answer) would make it simply:
declare
-- lots of vars
begin
insert into global_temp_table (t_dbusernaam,
t_nodenaam,
t_dbnode1,
t_dbnode2,
t_distinctcount)
select dbusernaam,
nodenaam,
dbnode1,
dbnode2,
cnt
from (select nodenaam,
naam,
url,
dbnode1,
dbnode2,
dbusernaam,
dbnaam,
count(*) over (partition by dbnode1, dbnode2, dbusernaam) cnt
from schema.tablename
where nodenaam = 'XXYZ123')
where cnt = 1;
end;
/
This has the advantage of being more compact than your original code, making it easier to read, understand and therefore debug. Plus you can run the select statement on its own outside of the procedure - much easier to see what it's doing that way.
It will also be faster than your original approach of looping through two cursors (which, by the way, was reinventing the nested loop join - something that the database is optimised to do in pure SQL... and may not be the fastest way of doing the join anyway, if you had been stuck with keeping the join!).
I'd also be interested to know why you need to insert the rows into the GLOBAL_TEMP_TABLE (which I suspect is a GTT - global temporary table - rather than a normal heap table) - can you not do the subsequent processing in a single SQL statement, using the above select statement rather than inserting the data into the GTT?
This is not an error, but the TOAD suggestion with number Rule 4809.
P.S. If the table is the same in the both query, you can use
..., COUNT(*) OVER (PARTITION BY DBNODE1, DBNODE2, DBUSERNAAM) c
in the first query to get the number of rows per DBNODE1, DBNODE2, DBUSERNAAM and not to need the second one.

Oracle insert 1000 rows at a time

I would like to insert 1000 rows at a time with oracle
Example:
INSERT INTO MSG(AUTHOR)
SELECT AUTHOR FROM oldDB.MSGLOG
This insert is taking a very long time but if I limit it with ROWNUM <= 1000 it will insert right away so I want to create an import that goes throuhg my X number of rows and inserts 1000 at at time.
Thanks
It is rather doubtful that this will really improve performance particularly given the simplicity of the SELECT statement. That must be doing either a full scan of the table or of an index on author. If that scan is slow, you're much better off diagnosing the underlying problem rather than trying to work around it (for example, perhaps oldDB.MsgLog has a number of empty blocks below the high water mark that forces a full table scan to read many more blocks than is strictly necessary).
If you really want to write some more verbose and less efficient PL/SQL to accomplish the task, though, you certainly can
DECLARE
TYPE tbl_authors IS TABLE OF msg.author%TYPE;
l_authors tbl_authors;
CURSOR author_cursor
IS SELECT author
FROM oldDB.MsgLog;
BEGIN
OPEN author_cursor;
LOOP
FETCH author_cursor
BULK COLLECT INTO l_authors
LIMIT 1000;
EXIT WHEN l_authors.count = 0;
FORALL i IN 1..l_authors.count
INSERT INTO msg( author )
VALUES( l_authors(i) );
END LOOP;
END;

Why Oracle BULK DML operations are faster [duplicate]

Can you help me to understand this phrase?
Without the bulk bind, PL/SQL sends a SQL statement to the SQL engine
for each record that is inserted, updated, or deleted leading to
context switches that hurt performance.
Within Oracle, there is a SQL virtual machine (VM) and a PL/SQL VM. When you need to move from one VM to the other VM, you incur the cost of a context shift. Individually, those context shifts are relatively quick, but when you're doing row-by-row processing, they can add up to account for a significant fraction of the time your code is spending. When you use bulk binds, you move multiple rows of data from one VM to the other with a single context shift, significantly reducing the number of context shifts, making your code faster.
Take, for example, an explicit cursor. If I write something like this
DECLARE
CURSOR c
IS SELECT *
FROM source_table;
l_rec source_table%rowtype;
BEGIN
OPEN c;
LOOP
FETCH c INTO l_rec;
EXIT WHEN c%notfound;
INSERT INTO dest_table( col1, col2, ... , colN )
VALUES( l_rec.col1, l_rec.col2, ... , l_rec.colN );
END LOOP;
END;
then every time I execute the fetch, I am
Performing a context shift from the PL/SQL VM to the SQL VM
Asking the SQL VM to execute the cursor to generate the next row of data
Performing another context shift from the SQL VM back to the PL/SQL VM to return my single row of data
And every time I insert a row, I'm doing the same thing. I am incurring the cost of a context shift to ship one row of data from the PL/SQL VM to the SQL VM, asking the SQL to execute the INSERT statement, and then incurring the cost of another context shift back to PL/SQL.
If source_table has 1 million rows, that's 4 million context shifts which will likely account for a reasonable fraction of the elapsed time of my code. If, on the other hand, I do a BULK COLLECT with a LIMIT of 100, I can eliminate 99% of my context shifts by retrieving 100 rows of data from the SQL VM into a collection in PL/SQL every time I incur the cost of a context shift and inserting 100 rows into the destination table every time I incur a context shift there.
If can rewrite my code to make use of bulk operations
DECLARE
CURSOR c
IS SELECT *
FROM source_table;
TYPE nt_type IS TABLE OF source_table%rowtype;
l_arr nt_type;
BEGIN
OPEN c;
LOOP
FETCH c BULK COLLECT INTO l_arr LIMIT 100;
EXIT WHEN l_arr.count = 0;
FORALL i IN 1 .. l_arr.count
INSERT INTO dest_table( col1, col2, ... , colN )
VALUES( l_arr(i).col1, l_arr(i).col2, ... , l_arr(i).colN );
END LOOP;
END;
Now, every time I execute the fetch, I retrieve 100 rows of data into my collection with a single set of context shifts. And every time I do my FORALL insert, I am inserting 100 rows with a single set of context shifts. If source_table has 1 million rows, this means that I've gone from 4 million context shifts to 40,000 context shifts. If context shifts accounted for, say, 20% of the elapsed time of my code, I've eliminated 19.8% of the elapsed time.
You can increase the size of the LIMIT to further reduce the number of context shifts but you quickly hit the law of diminishing returns. If you used a LIMIT of 1000 rather than 100, you'd eliminate 99.9% of the context shifts rather than 99%. That would mean that your collection was using 10x more PGA memory, however. And it would only eliminate 0.18% more elapsed time in our hypothetical example. You very quickly reach a point where the additional memory you're using adds more time than you save by eliminating additional context shifts. In general, a LIMIT somewhere between 100 and 1000 is likely to be the sweet spot.
Of course, in this example, it would be more efficient still to eliminate all context shifts and do everything in a single SQL statement
INSERT INTO dest_table( col1, col2, ... , colN )
SELECT col1, col2, ... , colN
FROM source_table;
It would only make sense to resort to PL/SQL in the first place if you're doing some sort of manipulation of the data from the source table that you can't reasonably implement in SQL.
Additionally, I used an explicit cursor in my example intentionally. If you are using implicit cursors, in recent versions of Oracle, you get the benefits of a BULK COLLECT with a LIMIT of 100 implicitly. There is another StackOverflow question that discusses the relative performance benefits of implicit and explicit cursors with bulk operations that goes into more detail about those particular wrinkles.
AS I understand this, there are two engine involved, PL/SQL engine and SQL Engine. Executing a query that make use of one engine at a time is more efficient than switching between the two
Example:
INSERT INTO t VALUES(1)
is processed by SQL engine while
FOR Lcntr IN 1..20
END LOOP
is executed by PL/SQL engine
If you combine the two statement above, putting INSERT in the loop,
FOR Lcntr IN 1..20
INSERT INTO t VALUES(1)
END LOOP
Oracle will be switching between the two engines, for the each (20) iterations.
In this case BULK INSERT is recommended which makes use of PL/SQL engine all through the execution

Optimize Oracle 11g Procedure

I have a procedure to find the first, last, max and min prices for a series of transactions in a very large table which is organized by date, object name, and a code. I also need the sum of quantities transacted. There are about 3 billion rows in the table and this procedure takes many days to run. I would like to cut that time down as much as possible. I have an index on the distinct fields in the trans table, and looking at the explain plan on the select portion of the queries, the index is being used. I am open to suggestions on an alternate approach. I use Oracle 11g R2. Thank you.
declare
cursor c_iter is select distinct dt, obj, cd from trans;
r_iter c_iter%ROWTYPE;
v_fir number(15,8);
v_las number(15,8);
v_max number(15,8);
v_min number(15,8);
v_tot number;
begin
open c_iter;
loop
fetch c_iter into r_iter;
exit when c_iter%NOTFOUND;
select max(fir), max(las) into v_fir, v_las
from
( select
first_value(prc) over (order by seq) as "FIR",
first_value(prc) over (order by seq desc) as "LAS"
from trans
where dt = r_iter.DT and obj = r_iter.OBJ and cd = r_iter.CD );
select max(prc), min(prc), sum(qty) into v_max, v_min, v_tot
from trans
where dt = r_iter.DT and obj = r_iter.OBJ and cd = r_iter.CD;
insert into stats (obj, dt, cd, fir, las, max, min, tot )
values (r_iter.OBJ, r_iter.DT, r_iter.CD, v_fir, v_las, v_max, v_min, v_tot);
commit;
end loop;
close c_iter;
end;
alter session enable parallel dml;
insert /*+ append parallel(stats)*/
into stats(obj, dt, cd, fir, las, max, min, tot)
select /*+ parallel(trans) */ obj, dt, cd
,max(prc) keep (dense_rank first order by seq) fir
,max(prc) keep (dense_rank first order by seq desc) las
,max(prc) max, min(prc) min, sum(qty) tot
from trans
group by obj, dt, cd;
commit;
A single SQL statement is usually significantly faster than multiple SQL statements. They sometimes require more resources, like more temporary tablespace, but your distinct cursor is probably already sorting the entire table on disk anyway.
You may want to also enable parallel DML and parallel query, although depending on your object and system settings this may already be happening. (And it may not necessarily be a good thing, depending on your resources, but it usually helps large queries.)
Parallel write and APPEND should improve performance if the SQL writes a lot of data, but it also means that the new table will not be recoverable until the next backup. (Parallel DML will automatically use direct path writes, but I usually include APPEND anyway just in case the parallelism doesn't work correctly.)
There's a lot to consider, even for such a small query, but this is where I'd start.
Not the solid answer I'd like to give, but a few things to consider:
The first would be using a bulk collect. However, since you're using 11g, hopefully this is already being done for you automatically.
Do you really need to commit after every single iteration? I could be wrong, but am guessing this is one of your top time consumers.
Finally, +1 for jonearles' answer. (I wasn't sure if I'd be able to write everything into a single SQL query, but I was going to suggest this as well.)
You could try and make the query run in parallel, there is a reasonable Oracle White Paper on this here. This isn't an Oracle feature that I've ever had to use myself so I've no first hand experience of it to pass on. You will also need to have enough resources free on the Oracle server to allow you run the parallel processes that this will create.

Resources