Performance of using a nested table inside the IN clause - Oracle - performance

I'm trying to use a nested table inside the IN clause in a PL-SQL block.
First, I have defined a TYPE:
CREATE OR REPLACE TYPE VARCHAR_ARRAY AS TABLE OF VARCHAR2(32767);
Here is my PL-SQL block using the 'BULK COLLECT INTO':
DECLARE
COL1 VARCHAR2(50) := '123456789';
N_TBL VARCHAR_ARRAY := VARCHAR_ARRAY();
C NUMBER;
BEGIN
-- Print timestamp
DBMS_OUTPUT.PUT_LINE('START: ' || TO_CHAR(SYSTIMESTAMP ,'dd-mm-yyyy hh24:mi:ss.FF'));
SELECT COLUMN1
BULK COLLECT INTO N_TBL
FROM MY_TABLE
WHERE COLUMN1 = COL1;
SELECT COUNT(COLUMN1)
INTO C
FROM MY_OTHER_TABLE
WHERE COLUMN1 IN (SELECT column_value FROM TABLE(N_TBL));
-- Print timestamp
DBMS_OUTPUT.PUT_LINE('ENDED: ' || TO_CHAR(SYSTIMESTAMP ,'dd-mm-yyyy hh24:mi:ss.FF'));
END;
And the output is:
START: 01-08-2014 12:36:14.997
ENDED: 01-08-2014 12:36:17.554
It takes more than 2.5 seconds (2.557 seconds exactly)
Now, If I replace the nested table by a subquery, like this:
DECLARE
COL1 VARCHAR2(50) := '123456789';
N_TBL VARCHAR_ARRAY := VARCHAR_ARRAY();
C NUMBER;
BEGIN
-- Print timestamp
DBMS_OUTPUT.PUT_LINE('START: ' || TO_CHAR(SYSTIMESTAMP ,'dd-mm-yyyy hh24:mi:ss.FF'));
SELECT COUNT(COLUMN1)
INTO C
FROM MY_OTHER_TABLE
WHERE COLUMN1 IN (
-- Nested table replaced by a subquery
SELECT COLUMN1
FROM MY_TABLE
WHERE COLUMN1 = COL1
);
-- Print timestamp
DBMS_OUTPUT.PUT_LINE('ENDED: ' || TO_CHAR(SYSTIMESTAMP ,'dd-mm-yyyy hh24:mi:ss.FF'));
END;
The output is:
START: 01-08-2014 12:36:08.889
ENDED: 01-08-2014 12:36:08.903
It takes only 14 milliseconds...!!!
What could I do to enhance this PL-SQL block ?
Is there any database configuration needed?

Are the two query plans different?
Assuming that they are, the difference is likely that the optimizer has reasonable estimates about the number of rows the subquery will return and, thus, is able to choose the most efficient plan. When your data is in a nested table (I'd hate to use the word array in the type declaration here since that implies that you're using a varray when you're not), Oracle doesn't have information about how many elements are going to be in the collection. By default, it's going to guess that the collection has as many elements as your data blocks have bytes. So if you have 8k blocks, Oracle will guess that your collection has 8192 elements.
Assuming that your actual query doesn't return anywhere close to 8192 rows and that it actually returns many more or many fewer rows, you can potentially use the cardinality hint to let the optimizer make a more accurate guess. For example, if your query generally returns a few dozen rows, you probably want something like
SELECT COUNT(COLUMN1)
INTO C
FROM MY_OTHER_TABLE
WHERE COLUMN1 IN (SELECT /*+ cardinality(t 50) */ column_value
FROM TABLE(N_TBL) t);
The literal you put in the cardinality hint doesn't need to be particularly accurate, just close to general reality. If the number of rows is completely unknown the dynamic_sampling hint can help.
If you are using Oracle 11g, you may also benefit from cardinality feedback helping the optimizer learn to better estimate the number of elements in a collection.

Related

use values stored in varray sql-oracle?

I am trying to get my last 5 employees (ones with lowest salary) and raise their salary by 5%;
I am using a varray to store their id's but i don't know how to use those ids in a update statement (something like update employees \ set salary = salary * 1.05 \ where id_employee in varray)
here's what i have for now:
DECLARE
TYPE tip_cod IS VARRAY(20) OF NUMBER;
coduri tip_cod;
BEGIN
SELECT employee_id
BULK COLLECT INTO coduri
FROM (
SELECT employee_id
from employees
where commission_pct IS NULL
order by salary asc
)
WHERE ROWNUM < 6;
-- after i store their ids in coduri i want to update their salary
FOR i IN 1 .. coduri.COUNT LOOP
DBMS_OUTPUT.PUT_LINE(coduri(i));
END LOOP;
END;
/
If you are practicing the use of loops to do things one at a time (not a good approach for this task!) you can replace your calls to put_line with insert statements, something like
...
update employees set salary = 1.05 * salary where employee_id = coduri(i);
...
The beauty of PL/SQL is that you can embed such plain-SQL statements directly within PL/SQL code, no need for preparation of any kind.
After you are done with the updates, you will need to commit for the changes to be committed - usually after the procedure, not within it.
Alternatively, if you want a single update (with an in condition), you will need to define the varray table at the schema level, not within the anonymous block (or procedure). This is because the update statement is a SQL statement, which can't "see" locally defined data types. Then, in the update statement you will need to use the table operator to unwind its members. Something like this:
create type tip_cod is varray(20) of number;
/
DECLARE
coduri tip_cod;
BEGIN
SELECT employee_id
BULK COLLECT INTO coduri
FROM (
SELECT employee_id
from employees
where commission_pct IS NULL
order by salary asc
)
WHERE ROWNUM < 6;
update employees set salary = 1.05 * salary
where employee_id in (select * from table(coduri));
END;
/
commit;
Notice how the varray type is defined on its own, then it is used in the PL/SQL block. Also don't forget the commit at the end.
When you work with collection types, there is also the member of predicate, as in employee_id member of coduri. Alas, this only works with locally-defined data types; since the varray type must be declared at the schema level (so that it can be used in a SQL statement within the PL/SQL code), you can't use member of and you must unwind the array explicitly, with the table operator.
There id much more to collections (Oracle term for array). There are 3 types:
Varrays
Associative Arrays
Nested Tables
If you want to understand collections you must understand all 3. (imho: Of the 3 Varrays are the most limited).
Mathguy presents 1 option, "casting" the array as a table, via the TABLE(...) function. I'll present another: Nested Table combined with Bulk Collect/Forall combination to accomplish he update.
declare
type employee_id_att is table of hr.employees.employee_id%type;
employee_id_array employee_id_att;
begin
select employee_id
bulk collect
into employee_id_array
from hr.employees
where commission_pct is null
order by salary
fetch first 5 rows only;
forall emp_indx in 1 .. employee_id_array.count
update hr.employees
set salary = 1.05 * salary
where employee_id = employee_id_array(emp_indx);
end ;
/
Take Away: There is much, much more to collections than defining a LOOP. Spend some time with the documentation and write tests and examine the results. But the important thing when you do not understand write some code. It will probably fail, that is good, so write something else. Do not be afraid of errors/ exceptions, in development they are friend. And if there is something you cannot understand then post a specific question. Be prepared to show several failed attempts; that will give the community an idea of your thinking and whether you are on the correct path or not.

Delete is taking considerable amount of time

I have written a plsql block to delete some records for a bunch of tables.
so to identify the records to be deleted I have created a cursor on top that query.
declare
type t_guid_invoice is ref cursor;
c_invoice t_guid_invoice;
begin
open c_invoice for
select * from a,b where a.col=b.col ;--(quite a complex join,renders 200k records)
loop fetch c_invoice into col1,col2,col3;
exit when c_invoice%NOTFOUND;
begin
DELETE
FROM tab2
WHERE cola= col1;
if SQL%rowcount > 0 then
dbms_output.put_line ( 'INFO: tab2 for ' || col1|| '/' || col2|| ' removed.');
else
dbms_output.put_line ( 'WARN: No tab2 for ' || col1|| '/' || col2|| ' found!');
end if;
eXception
when others then
dbms_output.put_line ( 'ERR: Problems while deleting tab2 for ' || col1|| '/' || col2 );
dbms_output.put_line ( SQLERRM );
end;
....
end loop;
This continues to loop through about 26 tables, there are some tables which are as big as 60 million records.
deletion is based on primary key in each table. All triggers are disabled before deletion process.
if I try to delete 10k records, it loops through 10k times, deleting multiple rows in each table but its taking as long as 30 minutes. There is no commit after each block, since I have to cater to simulation mode too.
Any suggestions to speed up the process? Thanks!
For sure, if you loop 10k times, all those DBMS_OUTPUT.PUT_LINE calls will slow things down (even if you aren't doing anything "smart") (and I wonder whether buffer is large enough). If you want to log what's going on, create a log table and an autonomous transaction procedure which will insert that info (and commit).
Apart from that, are tables properly indexed? E.g. that would be cola column in tab2 table (in code you posted). Did you collect statistics on tables and indexes? Probably wouldn't harm if you do that for the whole schema.
Did you check explain plan?
Do you know what takes the most time? Is it a ref cursor query (so it has to be optimized), or is it deleting itself?
Can't you avoid loop entirely? Row-by-row processing is slow. For example, instead of using ref cursor, create a table out of it, index it, and use it as
create table c_invoice as
select * from a join b on a.col = b.col;
create index i1inv_col1 on c_invoice (col1);
delete from tab2 t
where exists (select null
from c_invoice c
where c.col1 = t.cola
);
You typically never want to delete a large number of rows from a table in a loop.
You want to use one DELETE statement with an appropriate WHERE condition.
Additionaly while processing a large number of rows you typically do not want to use an index.
So your first step would be to check the rows that would not be deleted (your warnings)
You get the keys with the following query and you may log them
select a.col from a,b where a.col=b.col
minus
select cola from tab2;
In the second step you delete all rows with one statement.
delete
from tab2
where cola in (select a.col from a,b where a.col=b.col);
In problems check the execution plan, you expect TABLE ACCESS FULL (INDEX FAST FULL SCAN is fine as well) on all sources combined with a HASH JOIN.

Oracle access varray elements in SQL

I'm playing around with array support in Oracle and hit a roadblock regarding array access within a SQL query. I'm using the following schema:
create type smallintarray as varray(10) of number(3,0);
create table tbl (
id number(19,0) not null,
the_array smallintarray,
primary key (id)
);
What I would like to do is get the id and the first element i.e. at index 1 of the array. In PostgreSQL I could write select id, the_array[1] from tbl t but I don't see how I could do that with Oracle. I read that array access by index is only possible in PL/SQL, which would be fine if I could return a "decorated cursor" to achieve the same result through JDBC, but I don't know if that's possible.
DECLARE
c1 SYS_REFCURSOR;
varr smallintarray2;
BEGIN
OPEN c1 FOR SELECT t.id, t.THE_ARRAY from tbl t;
-- SELECT t.THE_ARRAY INTO varr FROM table_with_enum_arrays2 t;
-- return a "decorated cursor" with varr(1) at select item position 1
dbms_sql.return_result(c1);
END;
You can do this in plain SQL; it's not pretty, but it does work. You would prefer that Oracle had syntax to hide this from the programmer (and perhaps it does, at least in the most recent versions; I am still stuck at 12.2).
select t.id, q.array_element
from tbl t cross apply
( select column_value as array_element,
rownum as ord
from table(the_array)
) q
where ord = 1
;
EDIT If order of generating the elements through the table operator is a concern, you could do something like this (in Oracle 12.1 and higher; otherwise the function can't be part of the query itself, but it can be defined on its own):
with
function select_element(arr smallintarray, i integer)
return number
as
begin
return arr(i);
end;
select id, select_element(the_array, 1) as the_array_1
from tbl
/
First of all, please don't do that on production. Use tables instead of storing arrays within a table.
Answer to your question is to use column as a table source
SELECT t.id, ta.*
from tbl t,
table(t.THE_ARRAY) ta
order by column_value
-- offset 1 row -- in case if sometime you'll need to skip a row
fetch first 1 row only;
UPD: as for ordering the array I can only say playing with 2asc/desc" parameters provided me with results I've expected - it has been ordered ascending or descending.
UPD2: found a cool link to description of performance issues might happen

Oracle CLOB column and LAG

I'm facing a problem when I try to use LAG function on CLOB column.
So let's assume we have a table
create table test (
id number primary key,
not_clob varchar2(255),
this_is_clob clob
);
insert into test values (1, 'test1', to_clob('clob1'));
insert into test values (2, 'test2', to_clob('clob2'));
DECLARE
x CLOB := 'C';
BEGIN
FOR i in 1..32767
LOOP
x := x||'C';
END LOOP;
INSERT INTO test(id,not_clob,this_is_clob) values(3,'test3',x);
END;
/
commit;
Now let's do a select using non-clob columns
select id, lag(not_clob) over (order by id) from test;
It works fine as expected, but when I try the same with clob column
select id, lag(this_is_clob) over (order by id) from test;
I get
ORA-00932: inconsistent datatypes: expected - got CLOB
00932. 00000 - "inconsistent datatypes: expected %s got %s"
*Cause:
*Action:
Error at Line: 1 Column: 16
Can you tell me what's the solution of this problem as I couldn't find anything on that.
The documentation says the argument for any analytic function can be any datatype but it seems unrestricted CLOB is not supported.
However, there is a workaround:
select id, lag(dbms_lob.substr(this_is_clob, 4000, 1)) over (order by id)
from test;
This is not the whole CLOB but 4k should be good enough in many cases.
I'm still wondering what is the proper way to overcome the problem
Is upgrading to 12c an option? The problem is nothing to do with CLOB as such, it's the fact that Oracle has a hard limit for strings in SQL of 4000 characters. In 12c we have the option to use extended data types (providing we can persuade our DBAs to turn it on!). Find out more.
Some of the features may not work properly in SQL when using CLOBs(like DISTINCT , ORDER BY GROUP BY etc. Looks like LAG is also one of them but, I couldn't find anywhere in docs.
If your values in the CLOB columns are always less than 4000 characters, you may use TO_CHAR
select id, lag( TO_CHAR(this_is_clob)) over (order by id) from test;
OR
convert it into an equivalent SELF JOIN ( may not be as efficient as LAG )
SELECT a.id,
b.this_is_clob AS lagging
FROM test a
LEFT JOIN test b ON b.id < a.id;
Demo
I know this is an old question, but I think I found an answer which eliminates the need to restrict the CLOB length and wanted to share it. Utilizing CTE and recursive subqueries, we can replicate the lag functionality with CLOB columns.
First, let's take a look at my "original" query:
WITH TEST_TABLE AS
(
SELECT LEVEL ORDER_BY_COL,
TO_CLOB(LEVEL) AS CLOB_COL
FROM DUAL
CONNECT BY LEVEL <= 10
)
SELECT tt.order_by_col,
tt.clob_col,
LAG(tt.clob_col) OVER (ORDER BY tt.order_by_col)
FROM test_table tt;
As expected, I get the following error:
ORA-00932: inconsistent datatypes: expected - got CLOB
Now, lets look at the modified query:
WITH TEST_TABLE AS
(
SELECT LEVEL ORDER_BY_COL,
TO_CLOB(LEVEL) AS CLOB_COL
FROM DUAL
CONNECT BY LEVEL <= 10
),
initial_pull AS
(
SELECT tt.order_by_col,
LAG(tt.order_by_col) OVER (ORDER BY tt.order_by_col) AS PREV_ROW,
tt.clob_col
FROM test_table tt
),
recursive_subquery (order_by_col, prev_row, clob_col, prev_clob_col) AS
(
SELECT ip.order_by_col, ip.prev_row, ip.clob_col, NULL
FROM initial_pull ip
WHERE ip.prev_row IS NULL
UNION ALL
SELECT ip.order_by_col, ip.prev_row, ip.clob_col, rs.clob_col
FROM initial_pull ip
INNER JOIN recursive_subquery rs ON ip.prev_row = rs.order_by_col
)
SELECT rs.order_by_col, rs.clob_col, rs.prev_clob_col
FROM recursive_subquery rs;
So here is how it works.
I create the TEST_TABLE, this really is only for the example as you should already have this table somewhere in your schema.
I create a CTE of the data I want to pull, plus a LAG function on the primary key (or a unique column) in the table partitioned and ordered in the same way I would have in my original query.
Create a recursive subquery using the initial row as the root and descending row by row joining on the lagged column. Returning both the CLOB column from the current row and the CLOB column from its parent row.

Bulk Insert with large static data in Oracle

I have a simple table with one column of numbers. I want to load about 3000 numbers in it. I want to do that in memory, without using SQL*Loader. I tried
INSERT ALL
INTO t_table (code) VALUES (n1)
INTO t_table (code) VALUES (n2)
...
...
INTO t_table (code) VALUES (n3000)
SELECT * FROM dual
But I fails at 1000 values. What should I do ? Is SQL*Loader the only way ? Can I do LOAD with SQL only ?
Presumably you have an initial value of n. If so, this code will populate code with values n to n+2999 :
insert into t_table (code)
select (&N + level ) - 1
from dual
connect by level <=3000
This query uses a SQL*Plus substitution variable to post the initial value of n. Other clients will need to pass the value in a different way.
"Assume that I am in c++ with a stl::vector, what query should I
write ?"
So when you wrote n3000 what you really meant was n(3000). It's easy enough to use an array in SQL. This example uses one of Oracle's pre-defined collections, a table of type NUMBER:
declare
ids system.number_tbl_type;
begin
insert into t_table (code)
select column_value
from table ( select ids from dual )
;
end;
As for mapping your C++ vector to Oracle types, that's a different question (and one which I can't answer).

Resources