Optimize Oracle 11g Procedure - oracle

I have a procedure to find the first, last, max and min prices for a series of transactions in a very large table which is organized by date, object name, and a code. I also need the sum of quantities transacted. There are about 3 billion rows in the table and this procedure takes many days to run. I would like to cut that time down as much as possible. I have an index on the distinct fields in the trans table, and looking at the explain plan on the select portion of the queries, the index is being used. I am open to suggestions on an alternate approach. I use Oracle 11g R2. Thank you.
declare
cursor c_iter is select distinct dt, obj, cd from trans;
r_iter c_iter%ROWTYPE;
v_fir number(15,8);
v_las number(15,8);
v_max number(15,8);
v_min number(15,8);
v_tot number;
begin
open c_iter;
loop
fetch c_iter into r_iter;
exit when c_iter%NOTFOUND;
select max(fir), max(las) into v_fir, v_las
from
( select
first_value(prc) over (order by seq) as "FIR",
first_value(prc) over (order by seq desc) as "LAS"
from trans
where dt = r_iter.DT and obj = r_iter.OBJ and cd = r_iter.CD );
select max(prc), min(prc), sum(qty) into v_max, v_min, v_tot
from trans
where dt = r_iter.DT and obj = r_iter.OBJ and cd = r_iter.CD;
insert into stats (obj, dt, cd, fir, las, max, min, tot )
values (r_iter.OBJ, r_iter.DT, r_iter.CD, v_fir, v_las, v_max, v_min, v_tot);
commit;
end loop;
close c_iter;
end;

alter session enable parallel dml;
insert /*+ append parallel(stats)*/
into stats(obj, dt, cd, fir, las, max, min, tot)
select /*+ parallel(trans) */ obj, dt, cd
,max(prc) keep (dense_rank first order by seq) fir
,max(prc) keep (dense_rank first order by seq desc) las
,max(prc) max, min(prc) min, sum(qty) tot
from trans
group by obj, dt, cd;
commit;
A single SQL statement is usually significantly faster than multiple SQL statements. It sometimes requires more resources, like more temporary tablespace, but your distinct cursor is probably already sorting the entire table on disk anyway.
You may want to also enable parallel DML and parallel query, although depending on your object and system settings this may already be happening. (And it may not necessarily be a good thing, depending on your resources, but it usually helps large queries.)
Parallel write and APPEND should improve performance if the SQL writes a lot of data. Be aware that if the direct-path insert runs without redo logging (for example, the table is NOLOGGING or the database is in NOARCHIVELOG mode), the new data will not be recoverable until the next backup. (Parallel DML will automatically use direct path writes, but I usually include APPEND anyway just in case the parallelism doesn't work correctly.)
There's a lot to consider, even for such a small query, but this is where I'd start.
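To see how the KEEP (DENSE_RANK ...) aggregates in the statement above pick out the first and last price, here is a minimal sketch against three made-up rows. The inline data is purely illustrative and not taken from the real trans table. It uses DENSE_RANK LAST ORDER BY seq, which should give the same result as DENSE_RANK FIRST ORDER BY seq DESC.

```sql
-- Illustrative data only: three transactions for one (dt, obj, cd) group.
with trans as (
  select date '2020-01-01' dt, 'A' obj, 'X' cd, 1 seq, 10 prc, 5 qty from dual union all
  select date '2020-01-01', 'A', 'X', 2, 12, 3 from dual union all
  select date '2020-01-01', 'A', 'X', 3,  9, 2 from dual
)
select obj, dt, cd,
       max(prc) keep (dense_rank first order by seq) as fir,  -- price at lowest seq
       max(prc) keep (dense_rank last  order by seq) as las,  -- price at highest seq
       max(prc) as mx,
       min(prc) as mn,
       sum(qty) as tot
from   trans
group  by obj, dt, cd;
-- One row: fir = 10, las = 9, mx = 12, mn = 9, tot = 10
```

Everything is computed in a single pass per group, which is why the row-by-row loop and its two extra queries per group are no longer needed.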

Not the solid answer I'd like to give, but a few things to consider:
The first would be using a bulk collect. Since you're using 11g, a cursor FOR loop would already fetch in batches for you automatically, but an explicit OPEN/FETCH loop like yours still fetches one row at a time.
Do you really need to commit after every single iteration? I could be wrong, but am guessing this is one of your top time consumers.
Finally, +1 for jonearles' answer. (I wasn't sure if I'd be able to write everything into a single SQL query, but I was going to suggest this as well.)
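If the loop has to stay for some reason, a rough sketch of the two points above (batched fetching and no per-row commit) might look like the block below. It reuses the trans and stats tables from the question. Treat it as an assumption-laden outline rather than a tuned solution: a single commit at the end means the whole run is one transaction and needs enough undo space.

```sql
begin
  -- Cursor FOR loop: by default, Oracle 10g and later fetch these in batches.
  for r in (select distinct dt, obj, cd from trans) loop
    -- One query per group instead of two, inserting directly from the select.
    insert into stats (obj, dt, cd, fir, las, max, min, tot)
    select r.obj, r.dt, r.cd,
           max(prc) keep (dense_rank first order by seq),
           max(prc) keep (dense_rank last  order by seq),
           max(prc), min(prc), sum(qty)
    from   trans
    where  dt = r.dt and obj = r.obj and cd = r.cd;
  end loop;
  commit;  -- commit once, after the loop, rather than on every iteration
end;
/
```

This still runs one INSERT per group, so the single-statement approach should remain considerably faster.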

You could try to make the query run in parallel; there is a reasonable Oracle White Paper on this here. This isn't an Oracle feature that I've ever had to use myself, so I've no first-hand experience of it to pass on. You will also need to have enough resources free on the Oracle server to allow you to run the parallel processes that this will create.

Related

Oracle insert 1000 rows at a time

I would like to insert 1000 rows at a time with oracle
Example:
INSERT INTO MSG(AUTHOR)
SELECT AUTHOR FROM oldDB.MSGLOG
This insert is taking a very long time, but if I limit it with ROWNUM <= 1000 it inserts right away, so I want to create an import that goes through my X number of rows and inserts 1000 at a time.
Thanks
It is rather doubtful that this will really improve performance particularly given the simplicity of the SELECT statement. That must be doing either a full scan of the table or of an index on author. If that scan is slow, you're much better off diagnosing the underlying problem rather than trying to work around it (for example, perhaps oldDB.MsgLog has a number of empty blocks below the high water mark that forces a full table scan to read many more blocks than is strictly necessary).
If you really want to write some more verbose and less efficient PL/SQL to accomplish the task, though, you certainly can:
DECLARE
TYPE tbl_authors IS TABLE OF msg.author%TYPE;
l_authors tbl_authors;
CURSOR author_cursor
IS SELECT author
FROM oldDB.MsgLog;
BEGIN
OPEN author_cursor;
LOOP
FETCH author_cursor
BULK COLLECT INTO l_authors
LIMIT 1000;
EXIT WHEN l_authors.count = 0;
FORALL i IN 1..l_authors.count
INSERT INTO msg( author )
VALUES( l_authors(i) );
END LOOP;
END;

Why Oracle BULK DML operations are faster [duplicate]

Can you help me to understand this phrase?
Without the bulk bind, PL/SQL sends a SQL statement to the SQL engine
for each record that is inserted, updated, or deleted leading to
context switches that hurt performance.
Within Oracle, there is a SQL virtual machine (VM) and a PL/SQL VM. When you need to move from one VM to the other VM, you incur the cost of a context shift. Individually, those context shifts are relatively quick, but when you're doing row-by-row processing, they can add up to account for a significant fraction of the time your code is spending. When you use bulk binds, you move multiple rows of data from one VM to the other with a single context shift, significantly reducing the number of context shifts, making your code faster.
Take, for example, an explicit cursor. If I write something like this
DECLARE
CURSOR c
IS SELECT *
FROM source_table;
l_rec source_table%rowtype;
BEGIN
OPEN c;
LOOP
FETCH c INTO l_rec;
EXIT WHEN c%notfound;
INSERT INTO dest_table( col1, col2, ... , colN )
VALUES( l_rec.col1, l_rec.col2, ... , l_rec.colN );
END LOOP;
END;
then every time I execute the fetch, I am:
Performing a context shift from the PL/SQL VM to the SQL VM
Asking the SQL VM to execute the cursor to generate the next row of data
Performing another context shift from the SQL VM back to the PL/SQL VM to return my single row of data
And every time I insert a row, I'm doing the same thing. I am incurring the cost of a context shift to ship one row of data from the PL/SQL VM to the SQL VM, asking the SQL to execute the INSERT statement, and then incurring the cost of another context shift back to PL/SQL.
If source_table has 1 million rows, that's 4 million context shifts which will likely account for a reasonable fraction of the elapsed time of my code. If, on the other hand, I do a BULK COLLECT with a LIMIT of 100, I can eliminate 99% of my context shifts by retrieving 100 rows of data from the SQL VM into a collection in PL/SQL every time I incur the cost of a context shift and inserting 100 rows into the destination table every time I incur a context shift there.
I can rewrite my code to make use of bulk operations:
DECLARE
CURSOR c
IS SELECT *
FROM source_table;
TYPE nt_type IS TABLE OF source_table%rowtype;
l_arr nt_type;
BEGIN
OPEN c;
LOOP
FETCH c BULK COLLECT INTO l_arr LIMIT 100;
EXIT WHEN l_arr.count = 0;
FORALL i IN 1 .. l_arr.count
INSERT INTO dest_table( col1, col2, ... , colN )
VALUES( l_arr(i).col1, l_arr(i).col2, ... , l_arr(i).colN );
END LOOP;
END;
Now, every time I execute the fetch, I retrieve 100 rows of data into my collection with a single set of context shifts. And every time I do my FORALL insert, I am inserting 100 rows with a single set of context shifts. If source_table has 1 million rows, this means that I've gone from 4 million context shifts to 40,000 context shifts. If context shifts accounted for, say, 20% of the elapsed time of my code, I've eliminated 19.8% of the elapsed time.
You can increase the size of the LIMIT to further reduce the number of context shifts but you quickly hit the law of diminishing returns. If you used a LIMIT of 1000 rather than 100, you'd eliminate 99.9% of the context shifts rather than 99%. That would mean that your collection was using 10x more PGA memory, however. And it would only eliminate 0.18% more elapsed time in our hypothetical example. You very quickly reach a point where the additional memory you're using adds more time than you save by eliminating additional context shifts. In general, a LIMIT somewhere between 100 and 1000 is likely to be the sweet spot.
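The arithmetic in this hypothetical example can be checked with a small query. The figures (1 million rows, 4 context shifts per row when processing row by row, context shifts costing 20% of elapsed time) are the assumptions stated above, not measurements.

```sql
-- Assumptions from the text: 1,000,000 rows, 20% of elapsed time spent on context shifts.
with params as (
  select 1000000 as n_rows, 0.20 as shift_share from dual
),
limits as (
  select 1 as lim from dual union all   -- row by row
  select 100 from dual union all
  select 1000 from dual
)
select lim,
       4 * n_rows / lim                          as context_shifts,
       round(100 * shift_share * (1 - 1/lim), 2) as pct_elapsed_saved
from   params cross join limits
order  by lim;
-- lim = 1    : 4000000 shifts, 0     % saved
-- lim = 100  :   40000 shifts, 19.8  % saved
-- lim = 1000 :    4000 shifts, 19.98 % saved
```

Going from a LIMIT of 100 to 1000 gains only 0.18 percentage points in this model, while the collection grows tenfold.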
Of course, in this example, it would be more efficient still to eliminate all context shifts and do everything in a single SQL statement:
INSERT INTO dest_table( col1, col2, ... , colN )
SELECT col1, col2, ... , colN
FROM source_table;
It would only make sense to resort to PL/SQL in the first place if you're doing some sort of manipulation of the data from the source table that you can't reasonably implement in SQL.
Additionally, I used an explicit cursor in my example intentionally. If you are using implicit cursors, in recent versions of Oracle, you get the benefits of a BULK COLLECT with a LIMIT of 100 implicitly. There is another StackOverflow question that discusses the relative performance benefits of implicit and explicit cursors with bulk operations that goes into more detail about those particular wrinkles.
As I understand it, there are two engines involved: the PL/SQL engine and the SQL engine. Executing code that uses one engine at a time is more efficient than switching between the two.
Example:
INSERT INTO t VALUES(1);
is processed by the SQL engine, while
FOR Lcntr IN 1..20 LOOP
NULL;
END LOOP;
is executed by the PL/SQL engine.
If you combine the two statements above, putting the INSERT in the loop,
FOR Lcntr IN 1..20 LOOP
INSERT INTO t VALUES(1);
END LOOP;
Oracle will switch between the two engines on each of the 20 iterations.
In this case a bulk bind (FORALL) is recommended: it hands all of the rows to the SQL engine in a single switch instead of one switch per row.
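A bulk-bind version of the loop above might look like the following sketch. It assumes t has a single NUMBER column; the collection type and variable names are made up for illustration.

```sql
declare
  -- Hypothetical collection type holding the values to insert.
  type num_tab is table of number index by pls_integer;
  l_vals num_tab;
begin
  -- Filling the collection is pure PL/SQL: no SQL engine involved.
  for i in 1 .. 20 loop
    l_vals(i) := 1;
  end loop;

  -- FORALL sends all 20 inserts to the SQL engine in one round trip.
  forall i in 1 .. l_vals.count
    insert into t values (l_vals(i));
end;
/
```

Note that FORALL is not a loop: it takes exactly one DML statement, which must reference the collection through the index.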

A fast query that selects the number of rows in each table

I want a query that selects the number of rows in each table,
but the statistics are not up to date, so a query like this will not be accurate:
select table_name, num_rows from user_tables
I want to do this for several schemas, and each schema has at least 500 tables, some of which contain a lot of columns. It would take me days to update the statistics.
On the Ask Tom site, he suggests a function that includes this query:
'select count(*)
from ' || p_tname INTO l_columnValue;
A query with count(*) like this is really slow and will not give me fast results.
Is there a query that can quickly tell me how many rows are in a table?
You said in a comment that you want to delete (drop?) empty tables. If you don't want an exact count but only want to know if a table is empty you can do a shortcut count:
select count(*) from table_name where rownum < 2;
The optimiser will stop when it reaches the first row - the execution plan shows a 'count stopkey' operation - so it will be fast. It will return zero for an empty table, and one for a table with any data - you have no idea how much data, but you don't seem to care.
You still have a slight race condition between the count and the drop, of course.
This seems like a very odd thing to want to do: either your application uses the table, in which case dropping it will break something even if it's empty; or it doesn't, in which case it shouldn't matter whether it has (presumably redundant) data, and it can be dropped regardless. If you think there might be confusion, that sounds like your source (including DDL) control needs some work, maybe?
To check whether either of two tables in different schemas has a row, just count from both of them; either with a union:
select max(c) from (
select count(*) as c from schema1.table_name where rownum < 2
union all
select count(*) as c from schema2.table_name where rownum < 2
);
... or with greatest and two sub-selects, e.g.:
select greatest(
(select count(*) from schema1.table_name where rownum < 2),
(select count(*) from schema2.table_name where rownum < 2)
) from dual;
Either would return one if either table has any rows, and would only return zero if they were both empty.
Full Disclosure: I had originally suggested a query that specifically counts a column that's (a) indexed and (b) not null. @AlexPoole and @JustinCave pointed out (please see their comments below) that Oracle will optimize a COUNT(*) to do this anyway. As such, this answer has been altered significantly.
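To apply the rownum < 2 shortcut to every table in a schema, one possible sketch is a small PL/SQL block with dynamic SQL. It assumes you are connected as the schema owner (so user_tables lists the tables of interest) and that server output is enabled in your client; it only reports empty tables and does not drop anything.

```sql
declare
  l_has_row number;
begin
  for t in (select table_name from user_tables order by table_name) loop
    -- Stops at the first row found, so it stays cheap even on large tables.
    execute immediate
      'select count(*) from "' || t.table_name || '" where rownum < 2'
      into l_has_row;

    if l_has_row = 0 then
      dbms_output.put_line(t.table_name || ' is empty');
    end if;
  end loop;
end;
/
```

For other schemas, the same idea should work with all_tables and an owner prefix, provided you have SELECT privileges on those tables.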
There's a good explanation here for why User_Tables shouldn't be used for accurate row counts, even when statistics are up to date.
If your tables have indexes which can be used to speed up the count by doing an index scan rather than a table scan, Oracle will use them. This will make the counts faster, though not by any means instantaneous. That said, this is the only way I know to get an accurate count.
To check for empty (zero row) tables, please use the answer posted by Alex Poole.
You could make a table to hold the counts of each table. Then, set a trigger to run on INSERT for each of the tables you're counting that updates the main table.
You'd also need to include a trigger for DELETE.
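A minimal sketch of that trigger idea follows. The names row_counts, my_table and my_table_count_trg are hypothetical, and you would need one such trigger per counted table.

```sql
-- Hypothetical table holding one counter per tracked table.
create table row_counts (
  table_name varchar2(30) primary key,
  cnt        number not null
);

insert into row_counts (table_name, cnt) values ('MY_TABLE', 0);

-- Row-level trigger that keeps the counter in step with inserts and deletes.
create or replace trigger my_table_count_trg
after insert or delete on my_table
for each row
begin
  if inserting then
    update row_counts set cnt = cnt + 1 where table_name = 'MY_TABLE';
  else
    update row_counts set cnt = cnt - 1 where table_name = 'MY_TABLE';
  end if;
end;
/
```

Be aware of the trade-offs: every insert or delete now also updates the same counter row, which serializes concurrent writers, and TRUNCATE does not fire DML triggers, so the counter can drift.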

CTE With Insert In Oracle

I am running a query in Oracle with a CTE.
When I execute the query as a SELECT statement it works fine, but when I use it in an INSERT statement it takes a very long time to execute. Any help is appreciated. Here is the code:
INSERT INTO port_weeklydailypricesTest (co_code,start_dtm,end_dtm)
SELECT * FROM
(
WITH CTE(co_code, start_dtm, end_dtm) AS
(
SELECT co_code ,
CAST(NEXT_DAY(MIN(dlyprice_date),'FRIDAY')-6 AS DATE) start_dtm ,
CAST(NEXT_DAY(MIN(dlyprice_date),'FRIDAY') AS DATE) end_dtm
FROM feed_dlyprice
GROUP BY co_code
UNION ALL
SELECT co_code ,
CAST(TO_CHAR(end_dtm + INTERVAL '1' DAY,'DD-MON-YYYY') AS DATE),
CAST(TO_CHAR(end_dtm + INTERVAL '7' DAY,'DD-MON-YYYY') AS DATE)
FROM CTE
WHERE CAST(end_dtm AS DATE) <= TO_CHAR(TO_DATE(SYSDATE+1,'DD-MON-YYYY'))
)
SELECT co_code,start_dtm,end_dtm
FROM CTE
);
If, as you say, the performance of the SELECT on its own is satisfactory the problem must lie with the INSERT part of the statement.
There are a number of things which might cause an insert to run slow:
The most likely is the presence of a trigger on the target table which executes something very expensive.
Another possibility is that the insert is waiting on a locked resource (say some other process has an exclusive table level lock on the target table, or some other shared resource such as a code control table).
It could be a storage allocation issue, chaining or row migration, too many indexes or lots of derived columns.
Perhaps it is down to hardware: an underpowered network, dodgy interconnects, a bad disk.
This is by no means exhaustive. The items at the top are application issues which you should be able to investigate and resolve. The further down the list you go, the more likely it is that you will need the assistance of an on-site DBA.
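As a starting point for the application-level checks, the data dictionary can show whether the target table has triggers or many indexes. This sketch assumes the table is owned by the current user and was created with an unquoted name (so it is stored in upper case); otherwise the all_ or dba_ views with an owner filter would be needed.

```sql
-- Any triggers that fire on the target table?
select trigger_name, trigger_type, triggering_event, status
from   user_triggers
where  table_name = 'PORT_WEEKLYDAILYPRICESTEST';

-- How many indexes have to be maintained on every insert?
select index_name, uniqueness, status
from   user_indexes
where  table_name = 'PORT_WEEKLYDAILYPRICESTEST';
```

Lock waits and storage problems usually need views such as v$session that require extra privileges, which is where a DBA can help.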
