So I have this project I'm working on at work, and I've noticed a lot of people using the INSERT INTO SELECT method:
INSERT INTO candy_tbl (candy_name,
                       candy_type,
                       candy_qty)
SELECT food_name,
       food_type,
       food_qty
FROM   food_tbl
WHERE  food_type = 'C';
However, I use the following cursor method:
FOR rec IN ( SELECT food_name,
                    food_type,
                    food_qty
             FROM   food_tbl
             WHERE  food_type = 'C' )
LOOP
    INSERT INTO candy_tbl (candy_name,
                           candy_type,
                           candy_qty)
    VALUES (rec.food_name,
            rec.food_type,
            rec.food_qty);
END LOOP;
This will be going into a PL/SQL package. My question is, which is usually the 'preferred' method and when? I usually choose the cursor method because it gives me a little more flexibility with exception handling. But, I could see how it might be a performance issue when inserting a whole lot of records.
The FOR LOOP requires a fetch from the cursor for each row, and the INSERT inside the loop happens one row at a time. PL/SQL runs in the PL/SQL engine and SQL runs in the SQL engine, so the FOR LOOP:
- runs in the PL/SQL engine
- sends the query to the SQL engine, which opens a cursor, then switches back to the PL/SQL engine
- on each iteration does a FETCH from the cursor and then an INSERT, meaning a trip to the SQL engine and back to the PL/SQL engine both times
Each switch between SQL and PL/SQL, as well as each FETCH, is expensive.
The INSERT INTO SELECT is sent to the SQL engine once, runs there until done, and then control returns to PL/SQL.
Other advantages exist, but that is the main PL/SQL difference between the two methods.
The first one is faster since it is a single set-based SQL statement.
The latter operates row by row; for a very large table there will be a large difference in performance.
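As an aside, if the appeal of the cursor loop is per-row error handling, INSERT INTO SELECT can offer much of the same via DML error logging. A hedged sketch using the tables above (ERR$_CANDY_TBL is the default log table name that DBMS_ERRLOG generates; the 'candy load' tag is arbitrary):

-- One-time setup: create the error log table for candy_tbl.
begin
    dbms_errlog.create_error_log(dml_table_name => 'CANDY_TBL');
end;
/

insert into candy_tbl (candy_name, candy_type, candy_qty)
select food_name, food_type, food_qty
from   food_tbl
where  food_type = 'C'
log errors into err$_candy_tbl ('candy load')  -- failed rows land here instead of aborting the insert
reject limit unlimited;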
If you really need the flexibility of cursor processing but with better performance, there is a third, intermediate option: BULK COLLECT and FORALL with the SAVE EXCEPTIONS option. The trade-off is much more code complexity. The following is the basic structure.
declare
    error_in_forall exception;
    pragma exception_init (error_in_forall, -24381);
    cursor c_select is
        select ... ;
    type c_array_type is table of c_select%rowtype;
    v_select_data c_array_type;
begin
    open c_select;
    loop
        fetch c_select
            bulk collect into v_select_data
            limit 1000;
        exit when v_select_data.count = 0;
        forall rdata in 1 .. v_select_data.count save exceptions
            insert into <target_table> ( ... )
            values (v_select_data(rdata).column ... );
    end loop;
    close c_select;
exception
    when error_in_forall then
        <process the Oracle-generated bulk error collection>
end;
If any errors occurred during execution of the INSERT, the exception fires once when the FORALL completes, with Oracle having built a SQL%BULK_EXCEPTIONS collection containing the index value and error code of each failed row. See the PL/SQL Language Reference for your version for details.
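A sketch of what processing that collection might look like (tgt and qty_col are hypothetical; DBMS_OUTPUT is only for illustration):

declare
    error_in_forall exception;
    pragma exception_init(error_in_forall, -24381);
    type qty_tab is table of number;
    v_qty qty_tab := qty_tab(10, null, 30);  -- assume the NULL element violates a NOT NULL constraint
begin
    forall i in 1 .. v_qty.count save exceptions
        insert into tgt (qty_col) values (v_qty(i));  -- hypothetical table and column
exception
    when error_in_forall then
        -- SQL%BULK_EXCEPTIONS holds one record per failed row
        for i in 1 .. sql%bulk_exceptions.count loop
            dbms_output.put_line(
                'Collection index ' || sql%bulk_exceptions(i).error_index ||
                ' failed with '     || sqlerrm(-sql%bulk_exceptions(i).error_code));
        end loop;
end;
/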
Related
How many context switches will happen for the PL/SQL block given below?
Declare
ll_row_count number := 0;
begin
for i in (select * from employee)
loop
ll_row_count := ll_row_count+1;
update employee
set emp_name = upper(emp_name)
where emp_id = i.emp_id;
commit;
end loop;
dbms_output.put_line('Total rows updated' || ll_row_count);
end;
/
Context switches can happen for many reasons, including multitasking and interrupts. When speaking about Oracle database development, we generally mean only the switches between the PL/SQL engine and the SQL engine.
In your example, PL/SQL is calling SQL. There is one context switch when PL/SQL calls SQL and a second one when SQL returns to PL/SQL.
PL/SQL calls SQL to PARSE a statement, to EXECUTE a statement or to FETCH rows from a query. We can measure the number of calls using the SQL trace facility and the trace profiler called TKPROF.
I created a table with 199 rows and traced the execution of your code:
There were 3 PARSE calls: 1 for the SELECT, 1 for the UPDATE and 1 for the COMMIT.
There were 2 FETCH calls. When you code for i in (select * from employee) loop, PL/SQL will automatically fetch 100 rows at a time (in order to reduce the number of context switches).
There were 399 EXECUTE calls: 1 for the SELECT, 199 for the UPDATE and 199 for the COMMIT.
So there were 404 calls from PL/SQL to SQL, and 404 returns from SQL to PL/SQL, making 808 context switches in all.
We can cut the number of context switches roughly in half by committing once after the loop. It is strongly recommended to avoid over-frequent commits anyway: if you commit within a SELECT loop, you can get an ORA-01555 "snapshot too old" error because the UNDO your query needs is no longer available.
Generally, the best way to reduce context switches and enhance performance is to use set-based SQL. Failing that, we can process a bunch of rows at a time using BULK COLLECT and FORALL.
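For the block above, the set-based rewrite is a single UPDATE, one round trip to the SQL engine for the DML (a sketch; SQL%ROWCOUNT replaces the manual counter):

begin
    update employee
    set    emp_name = upper(emp_name);
    dbms_output.put_line('Total rows updated: ' || sql%rowcount);
    commit;
end;
/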
Context switching
While executing any block of code, if the executing engine needs to hand a statement to the other engine, that is called context switching. Here "engine" refers to the SQL engine and the PL/SQL engine.
That means, while your code is executing in PL/SQL, whenever a SQL statement is encountered the PL/SQL engine must pass it to the SQL engine; the SQL engine executes it and passes the result back to the PL/SQL engine. Hence two context switches happen per call.
Now, coming to your block, please see the inline comments. We will use N as the number of records in table employee.
Declare
ll_row_count number := 0;
begin
for i in (select * from employee) -- (CEIL(N/100)*2) context switch
loop
ll_row_count := ll_row_count+1;
update employee
set emp_name = upper(emp_name)
where emp_id = i.emp_id; -- 2*N context switch
commit; -- 2*N context switch
end loop;
dbms_output.put_line('Total rows updated' || ll_row_count);
end;
/
Now, why do we divide N by 100?
Because in Oracle 10g and above, the cursor FOR loop is optimized to fetch in bulk with a LIMIT of 100, to reduce context switching inside the loop.
So finally, the number of context switches is: (CEIL(N/100)*2) + 2*N + 2*N
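For example, with N = 500: CEIL(500/100)*2 = 10 switches for the bulk fetches, 2*500 = 1,000 for the updates, and 2*500 = 1,000 for the commits, roughly 2,010 context switches in all.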
In an older article from Oracle Magazine (now online as On Cursor FOR Loops) Steven Feuerstein showed an optimization for explicit cursor for loops using bulk collect (listing 4 in the online article):
DECLARE
CURSOR employees_cur is SELECT * FROM employees;
TYPE employee_tt IS TABLE OF employees_cur%ROWTYPE INDEX BY PLS_INTEGER;
l_employees employee_tt;
BEGIN
OPEN employees_cur;
LOOP
FETCH employees_cur BULK COLLECT INTO l_employees LIMIT 100;
-- process l_employees using pl/sql only
EXIT WHEN employees_cur%NOTFOUND;
END LOOP;
CLOSE employees_cur;
END;
I understand that bulk collect enhances performance because there are fewer context switches between SQL and PL/SQL.
My question is about implicit cursor for loops:
BEGIN
FOR S in (SELECT * FROM employees)
LOOP
-- process current record of S
END LOOP;
END;
Is there a context switch in each loop for each record? Is the problem the same as with explicit cursors or is it somehow optimized "behind the scene"? Would it be better to rewrite the code using explicit cursors with bulk collect?
Starting from Oracle 10g the optimizing PL/SQL compiler can automatically convert FOR LOOPs into BULK COLLECT loops with a default array size of 100.
So generally there's no need to convert implicit FOR loops into BULK COLLECT loops.
But sometimes you may want to use BULK COLLECT explicitly: for example, if the default array size of 100 rows per fetch does not satisfy your requirements, or if you want to process each fetched batch with set-based DML, as sketched below.
The same question was answered by Tom Kyte. You can check it here: Cursor FOR loops optimization in 10g
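For instance, a sketch of an explicit loop with a bigger array size, reusing the employee table from the earlier question (the 1,000-row LIMIT is an arbitrary choice):

declare
    cursor c_emp is select emp_id from employee;
    type id_tab is table of employee.emp_id%type;
    l_ids id_tab;
begin
    open c_emp;
    loop
        fetch c_emp bulk collect into l_ids limit 1000;  -- larger than the default 100
        exit when l_ids.count = 0;
        -- one set-based DML execution per batch of fetched rows
        forall i in 1 .. l_ids.count
            update employee
            set    emp_name = upper(emp_name)
            where  emp_id = l_ids(i);
    end loop;
    close c_emp;
    commit;
end;
/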
Yes, even if your -- process current record of S contains pure SQL and no PL/SQL, you still have context switches, as the FOR ... LOOP is PL/SQL but the query is SQL.
Whenever possible you should prefer to process your data with single SQL statements (consider MERGE too, not only DELETE, UPDATE, INSERT); in most cases they are faster than row-by-row processing.
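To illustrate the MERGE suggestion with the candy/food tables from the first question (the join on candy_name is an assumed key):

merge into candy_tbl c
using (select food_name, food_type, food_qty
       from   food_tbl
       where  food_type = 'C') f
on    (c.candy_name = f.food_name)
when matched then
    update set c.candy_qty = f.food_qty
when not matched then
    insert (candy_name, candy_type, candy_qty)
    values (f.food_name, f.food_type, f.food_qty);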
Note, you will not gain any performance if you loop through l_employees and perform DML for each record.
LIMIT 100 is rather useless. Processing only 100 rows at a time would be almost the same as processing rows one by one; Oracle does not run on a Z80 with 64K of memory.
First, I want to make it clear that the question is not about the materialized views feature.
Suppose, I have a table function that returns a pre-defined set of columns.
When a function call is submitted as
SELECT col1, col2, col3
FROM TABLE(my_tfn(:p1))
WHERE col4 = 'X';
I can evaluate the parameter and choose what queries to execute.
I can either open one of the pre-defined cursors, or I can assemble my query dynamically.
What if instead of evaluating the parameter I want to evaluate the text of the requesting query?
For example, if my function returns 20 columns but the query is only requesting 4,
I can assign NULLs to the remaining 16 columns of the return type and execute fewer joins.
Or I can push the filter down to my dynamic query.
Is there a way to make this happen?
More generally, is there a way to look at the requesting query before executing the function?
There is no robust way to identify the SQL that called a PL/SQL object.
Below is a not-so-robust way to identify the calling SQL. I've used code like this before, but only in special circumstances where I knew that the PL/SQL would never run concurrently.
This seems like it should be so simple. The data dictionary tracks all sessions and running SQL. You can find the current session with sys_context('userenv', 'sid'), match that to GV$SESSION, and then get either SQL_ID or PREV_SQL_ID. But neither of those contains the calling SQL. There's even a CURRENT_SQL in SYS_CONTEXT, but it's only for fine-grained auditing.
Instead, the calling SQL must be found by a string search. Using a unique name for the PL/SQL object will help filter out unrelated statements. To prevent re-running for old statements, the SQL must be individually purged from the shared pool as soon as it is found. This could lead to race conditions so this approach will only work if it's never called concurrently.
--Create simple test type for function.
create or replace type clob_table is table of clob;
--Table function that returns the SQL that called it.
--This requires elevated privileges to run.
--To simplify the code, run this as SYS:
-- "grant execute on sys.dbms_shared_pool to your_user;"
--(If you don't want to do that, convert this to invoker's rights and use dynamic SQL.)
create or replace function my_tfn return clob_table is
v_my_type clob_table;
type string_table is table of varchar2(4000);
v_addresses string_table;
v_hash_values string_table;
begin
--Get calling SQL based on the SQL text.
select sql_fulltext, address, hash_value
bulk collect into v_my_type, v_addresses, v_hash_values
from gv$sql
--Make sure there is something unique in the query.
where sql_fulltext like '%my_tfn%'
--But don't include this query!
--(Normally creating a quine is a challenge, but in V$SQL it's more of
-- a challenge to avoid quines.)
and sql_fulltext not like '%quine%';
--Flush the SQL statements immediately, so they won't show up in next run.
for i in 1 .. v_addresses.count loop
sys.dbms_shared_pool.purge(v_addresses(i)||', '||v_hash_values(i), 'C');
end loop;
--Return the SQL statement(s).
return v_my_type;
end;
/
Now queries like these will return themselves, demonstrating that the PL/SQL code was reading the SQL that called it:
SELECT * FROM TABLE(my_tfn) where 1=1;
SELECT * FROM TABLE(my_tfn) where 2=2;
But even if you go through all this trouble - what are you going to do with the results? Parsing SQL is insanely difficult unless you can ensure that everyone always follows strict syntax rules.
I want to write pl/sql code which utilizes a Cursor and Bulk Collect to retrieve my data. My database has rows in the order of millions, and sometimes I have to query it to fetch nearly all records on client's request. I do the querying and subsequent processing in batches, so as to not congest the server and show incremental progress to the client. I have seen that digging down for later batches takes considerably more time, which is why I am trying to do it by way of cursor.
Here is what should be simple pl/sql around my main sql query:
declare
cursor device_row_cur
is
select /* my_query_details */;
type l_device_rows is table of device_row_cur%rowtype;
out_entries l_device_rows := l_device_rows();
begin
open device_row_cur;
fetch device_row_cur
bulk collect into out_entries
limit 100;
close device_row_cur;
end;
I am doing batches of 100, and fetching them into out_entries. The problem is that this block compiles and executes just fine, but doesn't return the data rows it fetched. I would like it to return those rows just the way a select would. How can this be achieved? Any ideas?
An anonymous block can't return anything. You can assign values to a bind variable, including a collection type or ref cursor, inside the block. But the collection would have to be defined, as well as declared, outside the block. That is, it would have to be a type you can use in plain SQL, not something defined in PL/SQL. At the moment you're using a PL/SQL type that is defined within the block, and a variable that is declared within the block too - so it's out of scope to the client, and wouldn't be a valid type outside it either. (It also doesn't need to be initialised, but that's a minor issue).
Depending on how it will really be consumed, one option is to use a ref cursor, which you can declare and display through SQL*Plus or SQL Developer with the variable and print commands. For example:
variable rc sys_refcursor
begin
open :rc for ( select ... /* your cursor statement */ );
end;
/
print rc
You can do something similar from a client application, e.g. have a function returning a ref cursor or a procedure with an out parameter that is a ref cursor, and bind that from the application. Then iterate over the ref cursor as a result set. But the details depend on the language your application is using.
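A minimal sketch of the function variant (the name get_devices is hypothetical and the query body is a placeholder):

create or replace function get_devices return sys_refcursor is
    rc sys_refcursor;
begin
    open rc for
        select ... ;  -- your cursor statement
    return rc;
end;
/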
Another option is to have a pipelined function that returns a table type - again defined at SQL level (with create type) not in PL/SQL - which might consume fewer resources than a collection that's returned in one go.
But I'd have to question why you're doing this. You said "digging down for later batches takes considerably more time", which sounds like you're using a paging mechanism in your query, generating a row number and then picking out a range of 100 within that. If your client/application wants to get all the rows then it would be simpler to have a single query execution but fetch the result set in batches.
Unfortunately without any information about the application this is just speculation...
I studied this excellent paper on optimizing pagination:
http://www.inf.unideb.hu/~gabora/pagination/article/Gabor_Andras_pagination_article.pdf
I used technique 6 mainly. It describes how to limit the query to fetch page x and onward. For added improvement, you can limit it further to fetch page x alone. Used right, it can bring a performance improvement by a factor of 1000.
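The core of that technique is a nested ROWNUM filter, along these lines (a sketch; :first_row and :last_row are the page bounds, and the inner query and sort key are placeholders):

select *
from ( select q.*, rownum rn
       from ( select ...              -- the main query, with a deterministic ORDER BY
              order by sort_key ) q
       where rownum <= :last_row )    -- upper bound lets Oracle stop fetching early
where rn > :first_row;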
Instead of returning custom table rows (which is very hard, if not impossible, to interface with Java), I ended up opening a sys_refcursor in my PL/SQL, which can be interfaced like this:
OracleCallableStatement stmt = (OracleCallableStatement) connection.prepareCall(sql);
stmt.registerOutParameter(someIndex, OracleTypes.CURSOR);
stmt.execute();
resultSet = stmt.getCursor(someIndex);
Here's a piece of Oracle code I'm trying to adapt. I've abbreviated all the details:
declare
begin
loop
--do stuff to populate a global temporary table. I'll call it 'TempTable'
end loop;
end;
/
Select * from TempTable
Right now, this query runs fine provided I run it in two steps. First I run the program at the top, then I run the select * to get the results.
Is it possible to combine the two pieces so that I can populate the global temp table and retrieve the results all in one step?
Thanks in advance!
Well, for me it depends on how you see the steps. You are running a PL/SQL block and a SQL command. I would rather type those into a file and run them in one command (if that can be called a single step for you)...
Something like
file.sql
begin
loop
--do stuff to populate a global temporary table. I'll call it 'TempTable'
end loop;
end;
/
Select *
from TempTable
/
And run it as:
prompt> sqlplus /@db @file.sql
If you give us more details like how you populate the GTT, perhaps we might find a way to do it in a single step.
Yes, but it's not trivial.
create global temporary table my_gtt
( ... )
on commit preserve rows;
create or replace type my_gtt_rowtype as object
( [columns definition] )
/
create or replace type my_gtt_tabtype as table of my_gtt_rowtype
/
create or replace function pipe_rows_from_gtt
return my_gtt_tabtype
pipelined
is
pragma autonomous_transaction;
type rc_type is ref cursor;
my_rc rc_type;
my_output_rec my_gtt_rowtype := my_gtt_rowtype ([nulls for each attribute]);
begin
delete from my_gtt;
insert into my_gtt ...
commit;
open my_rc for select * from my_gtt;
loop
fetch my_rc into my_output_rec.attribute1, my_output_rec.attribute2, etc;
exit when my_rc%notfound;
pipe row (my_output_rec);
end loop;
close my_rc;
return;
end;
/
I don't know if the autonomous transaction pragma is required - but I suspect it is, otherwise it'll throw errors about functions performing DML.
We use code like this to have reporting engines which can't perform procedural logic build the global temporary tables they use (and reuse) in various subreports.
In Oracle, an extra table to store intermediate results is very seldom needed. It might help to make things easier to understand, but when you are able to write SQL to fill the intermediate table, you can certainly query those rows in a single step without wasting time filling a GTT. If you are using PL/SQL to populate the GTT, see if it can be rewritten as pure SQL; that will almost certainly give you a performance benefit.
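For example, if the loop's logic can be expressed as a query, a subquery factoring (WITH) clause can stand in for the GTT entirely; a sketch with the body left as a placeholder:

with temp_data as (
    select ...  -- the logic that used to populate TempTable
)
select *
from   temp_data;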