Slow Update When Using Oracle PL/SQL Table - oracle

We're using a PL/SQL table (named pTable) to collect a number of ids to be updated.
However, the statement
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (pTable));
takes a long time to execute.
It seems that the optimizer comes up with a very bad execution plan, instead of using the index that is defined on id (as the primary key) it decides to use a full table scan on the aTable. pTable usually contains very few values (in most cases just one).
What can we do to make this faster? The best we've come up with is to handle low pTable.Count (1 and 2) as special cases, but that is certainly not very elegant.
Thanks for all the great suggestions. I wrote about this issue in my blog at http://smartercoding.blogspot.com/2010/01/performance-issues-using-plsql-tables.html.

You can try the cardinality hint. This is good if you know (roughly) the number of rows in the collection.
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT /*+ cardinality( pt 10 ) */
COLUMN_VALUE
FROM TABLE (pTable) pt );

Here's another approach. Create a temporary table:
create global temporary table pTempTable ( id int primary key )
on commit delete rows;
To perform the update, populate pTempTable with the contents of pTable and execute:
update
(
select aColumn
from aTable aa join pTempTable pp on aa.id = pp.id
)
set aColumn = 1;
The should perform reasonably well without resorting to optimizer hints.

The bad execution plan is probably unavoidable (unfortunately). There is no statistics information for the PL/SQL table, so the optimizer has no way of knowing that there are few rows in it. Is it possible to use hints in an UPDATE? If so, you might force use of the index that way.

It helped to tell the optimizer to use the "correct" index instead of going on a wild full-table scan:
UPDATE /*+ INDEX(aTable PK_aTable) */aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (CAST (pdarllist AS list_of_keys)));
I couldn't apply this solution to more complicated scenarios, but found other workarounds for those.

You could try adding a ROWNUM < ... clause.
In this test a ROWNUM < 30 changes the plan to use an index.
Of course that depends on your set of values having a reasonable maximum size.
create table atable (acolumn number, id number);
insert into atable select rownum, rownum from dual connect by level < 150000;
alter table atable add constraint atab_pk primary key (id);
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'ATABLE');
create type type_coll is table of number(4);
/
declare
v_coll type_coll;
begin
v_coll := type_coll(1,2,3,4);
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (v_coll));
end;
/
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
UPDATE ATABLE SET ACOLUMN = 1 WHERE ID IN (SELECT COLUMN_VALUE FROM TABLE (:B1 ))
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | | | 142 (100)| |
| 1 | UPDATE | ATABLE | | | | |
|* 2 | HASH JOIN RIGHT SEMI | | 1 | 11 | 142 (8)| 00:00:02 |
| 3 | COLLECTION ITERATOR PICKLER FETCH| | | | | |
| 4 | TABLE ACCESS FULL | ATABLE | 150K| 1325K| 108 (6)| 00:00:02 |
----------------------------------------------------------------------------------------------
declare
v_coll type_coll;
begin
v_coll := type_coll(1,2,3,4);
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (v_coll)
where rownum < 30);
end;
/
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
UPDATE ATABLE SET ACOLUMN = 1 WHERE ID IN (SELECT COLUMN_VALUE FROM TABLE (:B1 ) WHERE
ROWNUM < 30)
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | | | 31 (100)| |
| 1 | UPDATE | ATABLE | | | | |
| 2 | NESTED LOOPS | | 1 | 22 | 31 (4)| 00:00:01 |
| 3 | VIEW | VW_NSO_1 | 29 | 377 | 29 (0)| 00:00:01 |
| 4 | SORT UNIQUE | | 1 | 58 | | |
|* 5 | COUNT STOPKEY | | | | | |
| 6 | COLLECTION ITERATOR PICKLER FETCH| | | | | |
|* 7 | INDEX UNIQUE SCAN | ATAB_PK | 1 | 9 | 0 (0)| |
---------------------------------------------------------------------------------------------------

I wonder if the MATERIALIZE hint in the subselect from the PL/SQL table would force a temp table instantiation and help the optimizer?
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT /*+ MATERIALIZE */ COLUMN_VALUE
FROM TABLE (pTable));

Related

Oracle UNION ALL query takes temp space

I have a query like this:
SELECT * FROM TEST1 LEFT OUTER JOIN TEST2 on TEST1.ID=TEST2.ID
UNION ALL
SELECT * FROM TEST3 LEFT OUTER JOIN TEST4 on TEST3.ID=TEST4.ID;
The behavior I see here is, it first join TEST1 and TEST2 tables (billions of rows) and then stores the output in temp tablespace. Then it joins TEST3 and TEST4 and then saves the output in same temp table. And finally select the records from there to display the result.
This behavior I see in both Redshift and Oracle. I was just wondering why it stores the result in temporary segments after getting result from first SELECT. It's time taking as well as eats up the temp space. Can not it just starts displaying the result after 1st SELECT is finishes and then goes for 2nd one (instead of storing).
This answer is somewhat speculative, because I don't have an Oracle doc reference. By inspection, we can imagine instead that you wanted to run the following query:
SELECT * FROM TEST1 JOIN TEST2
UNION ALL
SELECT * FROM TEST3 JOIN TEST4
ORDER BY some_col;
It should be clear that to apply any set operation like ORDER BY, all the records returned from the union query would need to be in one logical place. A temp table would seem to work.
That you are not using ORDER BY appears to not affect the workflow which Oracle is using.
I can also add another reason why Oracle is insisting on using a temp table here. Suppose it would be possible to write both halves of the union directly to the buffer. But what would happen if, at a later date, the size of the total union query suddenly exceeded what the buffer can hold? The answer is that your database would crash. So, using a temp table is a safe bet which should generally always work.
How do you observe this behaviour? By any chance don't you perform INSERT or CREATE TABLE? That would explain your observation, because at the end, all rows are required.
Also if your client has set an option fetch all rows this could be observed.
But in normal case, where the client is interested in few first rows Oracle returns quickly the first available (array size) rows from the first join ignoring the second one.
You may perform this little Gedankenexperiment:
create table test1 as
select rownum id,
lpad('x',1023,'X') pad
from dual connect by level <= 1000000;
Create analog the table 2 to 4.
Now run your query (adapted to valid syntax)
SELECT * FROM TEST1 CROSS JOIN TEST2
UNION ALL
SELECT * FROM TEST3 CROSS JOIN TEST4;
This returns for my the first page in SQL Developer in ca 30 seconds, which somehow disproves your claim.
Simple calculate the required TEMP space for two 10**6 * 10**6 cartesian join with row lenth 1K - this is far above my TEMP configuration.
The one possible way to observe what is Oracle actualy doing is to run the query with the /*+ gather_plan_statistics */ hint.
Than get the SQL_ID of the statement and check the actual rows A-Rowsin the plan
select * from table(dbms_xplan.display_cursor('a9y62gxagups6',null,'ALLSTATS LAST'));
SQL_ID a9y62gxagups6, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ * FROM TEST1 CROSS JOIN TEST2
UNION ALL SELECT * FROM TEST3 CROSS JOIN TEST4
Plan hash value: 1763392637
--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 50 |00:00:28.52 | 166K| 166K| 142K| | | |
| 1 | UNION-ALL | | 1 | | 50 |00:00:28.52 | 166K| 166K| 142K| | | |
| 2 | MERGE JOIN CARTESIAN| | 1 | 1000G| 50 |00:00:28.52 | 166K| 166K| 142K| | | |
| 3 | TABLE ACCESS FULL | TEST1 | 1 | 1000K| 1 |00:00:00.02 | 4 | 28 | 0 | | | |
| 4 | BUFFER SORT | | 1 | 1000K| 50 |00:00:28.49 | 166K| 166K| 142K| 1255M| 11M| 97M (0)|
| 5 | TABLE ACCESS FULL | TEST2 | 1 | 1000K| 1000K|00:00:03.66 | 166K| 166K| 0 | | | |
| 6 | MERGE JOIN CARTESIAN| | 0 | 1000G| 0 |00:00:00.01 | 0 | 0 | 0 | | | |
| 7 | TABLE ACCESS FULL | TEST3 | 0 | 1000K| 0 |00:00:00.01 | 0 | 0 | 0 | | | |
| 8 | BUFFER SORT | | 0 | 1000K| 0 |00:00:00.01 | 0 | 0 | 0 | 1103M| 10M| |
| 9 | TABLE ACCESS FULL | TEST4 | 0 | 1000K| 0 |00:00:00.01 | 0 | 0 | 0 | | | |
--------------------------------------------------------------------------------------------------------------------------------------
You see, that Oracle
1) full scanned the table2 (row 5)
2) get one row from table1 (row 3)
3) to return to frist 50 rows (row 0)
4) tables 3 and 4 are untached (rows 7 and 9)
You may simple adapt the example to you inner join to see similar results.

Oracle plan_baseline is ignored

I have query with bind variables which comming from outer application.
The optimizer use the the unwanted index and I want to force it use another plan.
So I generate the good plan using index hint and then created the baseline with the plans
and connect the wanted plan to the query sql_id, and change the fixed attribute to 'YES'.
I executed the DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE function
and the output shows that the wanted plan marked as fixed=yes.
So why when I'm running the query it still with the bad plan??
The code:
-- Query
SELECT DISTINCT t_01.puid
FROM PWORKSPACEOBJECT t_01 , PPOM_APPLICATION_OBJECT t_02
WHERE ( ( UPPER(t_01.pobject_type) IN ( UPPER( :1 ) , UPPER( :2 ) )
AND ( t_02.pcreation_date >= :3 ) ) AND ( t_01.puid = t_02.puid ) )
-- get the text
select sql_fulltext
from v$sqlarea
where sql_id = '21pts328r2nb7' and rownum = 1;
-- prepare the explain plan
explain plan for
SELECT DISTINCT t_01.puid
FROM PWORKSPACEOBJECT t_01 , PPOM_APPLICATION_OBJECT t_02
WHERE ( ( UPPER(t_01.pobject_type) IN ( UPPER( :1 ) , UPPER( :2 ) )
AND ( t_02.pcreation_date >= :3 ) ) AND ( t_01.puid = t_02.puid ) ) ;
-- we can see that there is no use of index - PIPIPWORKSPACEO_2
select * from table(dbms_xplan.display);
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| 61553 |
| 1 | HASH UNIQUE | | 10382 | 517K| 61553 |
| 2 | HASH JOIN | | 158K| 7885K| 61549 |
| 3 | INLIST ITERATOR | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| 52689 |
| 5 | INDEX RANGE SCAN | PIPIPWORKSPACEO_3 | 158K| | 534 |
| 6 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| 2911 |
------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
-- generate plan with the wanted index
explain plan for
select /*+ index(t_01 PIPIPWORKSPACEO_2)*/ distinct t_01.puid
from pworkspaceobject t_01 , ppom_application_object t_02
where ( ( upper(t_01.pobject_type) in ( upper( :1 ) , upper( :2 ) )
and ( t_02.pcreation_date >= :3 ) ) and ( t_01.puid = t_02.puid ) ) ;
-- the index working - the index used
select * from table(dbms_xplan.display);
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| 223K|
| 1 | HASH UNIQUE | | 10382 | 517K| 223K|
| 2 | HASH JOIN | | 158K| 7885K| 223K|
| 3 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| 214K|
| 4 | INDEX FULL SCAN | PIPIPWORKSPACEO_2 | 158K| | 162K|
| 5 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| 2911 |
-----------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
-- get the sql_id of the query with the good index
-- 7t72qvghr0zqh
select sql_id from v$sqlarea where sql_text like 'select /*+ index(t_01 PIPIPWORKSPACEO_2)%';
-- get the plan hash value of the good plan by the sql_id
--4040955653
select plan_hash_value from v$sql_plan where sql_id = '7t72qvghr0zqh';
-- get the plan hash value of the bad plan by the sql_id
--1044780890
select plan_hash_value from v$sql_plan where sql_id = '21pts328r2nb7';
-- load the source plan
begin
dbms_output.put_line(
dbms_spm.load_plans_from_cursor_cache
( sql_id => '21pts328r2nb7' )
);
END;
-- the new base line created with the bad plan
select * from dba_sql_plan_baselines;
-- load the good plan of the second sql_id (with the wanted index)
-- and bind it to the sql_handle of the source query
begin
dbms_output.put_line(
DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE
( sql_id => '7t72qvghr0zqh',
plan_hash_value => 4040955653,
sql_handle => 'SQL_4afac4211aa3317d' )
);
end;
-- new there are 2 plans bind to the same sql_handle and sql_text
select * from dba_sql_plan_baselines;
-- alter the good one to be fixed
begin
dbms_output.put_line(
dbms_spm.alter_sql_plan_baseline
( sql_handle =>
'SQL_4afac4211aa3317d',
PLAN_NAME => 'SQL_PLAN_4pyq444da6cbxf7c97cc7',
ATTRIBUTE_NAME => 'fixed',
ATTRIBUTE_VALUE => 'YES'
)) ;
end;
-- check the good plan - fixed = yes
select * from table(
dbms_xplan.display_sql_plan_baseline (
sql_handle => 'SQL_4afac4211aa3317d',
plan_name => 'SQL_PLAN_4pyq444da6cbxf7c97cc7',
format => 'ALL'));
--------------------------------------------------------------------------------
SQL handle: SQL_4afac4211aa3317d
SQL text: SELECT DISTINCT t_01.puid FROM PWORKSPACEOBJECT t_01 ,
PPOM_APPLICATION_OBJECT t_02 WHERE ( ( UPPER(t_01.pobject_type) IN (
UPPER( :1 ) , UPPER( :2 ) ) AND ( t_02.pcreation_date >= :3 ) ) AND (
t_01.puid = t_02.puid ) )
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Plan name: SQL_PLAN_4pyq444da6cbxf7c97cc7 Plan id: 4157177031
Enabled: YES Fixed: YES Accepted: YES Origin: MANUAL-LOAD
--------------------------------------------------------------------------------
Plan hash value: 4040955653
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| | 223K (1)| 00:44:37 |
| 1 | HASH UNIQUE | | 10382 | 517K| | 223K (1)| 00:44:37 |
|* 2 | HASH JOIN | | 158K| 7885K| 6192K| 223K (1)| 00:44:37 |
| 3 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| | 214K (1)| 00:42:50 |
|* 4 | INDEX FULL SCAN | PIPIPWORKSPACEO_2 | 158K| | | 162K (1)| 00:32:25 |
|* 5 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| | 2911 (1)| 00:00:35 |
-----------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
3 - SEL$1 / T_01#SEL$1
4 - SEL$1 / T_01#SEL$1
5 - SEL$1 / T_02#SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T_01"."PUID"="T_02"."PUID")
4 - filter(UPPER("POBJECT_TYPE")=UPPER(:1) OR UPPER("POBJECT_TYPE")=UPPER(:2))
5 - access("T_02"."PCREATION_DATE">=:3)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) "T_01"."PUID"[VARCHAR2,15]
2 - (#keys=1) "T_01"."PUID"[VARCHAR2,15]
3 - "T_01"."PUID"[VARCHAR2,15]
4 - "T_01".ROWID[ROWID,10]
5 - "T_02"."PUID"[VARCHAR2,15]
Note
-----
- 'PLAN_TABLE' is old version
-- run explain plan for the query
-- need to use the new plan
declare
v_string clob;
begin
select sql_fulltext
into v_string
from v$sqlarea
where sql_id = '21pts328r2nb7' and rownum = 1;
execute immediate 'explain plan for ' || v_string using '1','1',sysdate;
end;
-- check the plan - still the unwanted index and plan
select * from table(dbms_xplan.display);
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| 61553 |
| 1 | HASH UNIQUE | | 10382 | 517K| 61553 |
| 2 | HASH JOIN | | 158K| 7885K| 61549 |
| 3 | INLIST ITERATOR | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| 52689 |
| 5 | INDEX RANGE SCAN | PIPIPWORKSPACEO_3 | 158K| | 534 |
| 6 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| 2911 |
------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
From a read through of your test case, I suspect the problem is that you're interpreting the FIXED attribute incorrectly.
If you list all the plans for your baseline, you will probably find the original and the loaded cursor plan are both ENABLED and ACCEPTED at the moment. I think what you need to do (based on my own usage of these calls) is use the ENABLED attribute. Set ENABLED to NO for the unwanted plan.
Try:
exec dbms_spm.alter_sql_plan_baseline(
sql_handle=>'SQL_...' -- baseline to update
,plan_name=>'SQL_PLAN_...' -- unwanted plan signature to disable
,attribute_name=>'ENABLED',attribute_value=>'NO')

Optimizing the SORT MERGE join of a MERGE statement

Consider the problem of applying changes to an aggregate table. Row that exist must be updated while new rows must be inserted. My approach was as follows:
Insert all changes in a temporary table (100K at a time)
MERGE the temporary table into the main table (eventually reaching 100s of millions rows)
The SQL (with a SORT MERGE hint) looks as follows (nothing fancy):
merge /*+ USE_MERGE(t s) */
into F_SCREEN_INSTANCE t
using F_SCREEN_INSTANCE_BUF s
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID)
when matched then update set
t.ACTIVE_TIME_SUM = t.ACTIVE_TIME_SUM + s.ACTIVE_TIME_SUM,
t.IDLE_TIME_SUM = t.IDLE_TIME_SUM + s.IDLE_TIME_SUM
when not matched then insert values (
s.DAY_ID, s.PARTIAL_ID, s.ID, s.AGENT_USER_ID, s.COMPUTER_ID, s.RAW_APPLICATION_ID, s.APP_USER_ID, s.APPLICATION_ID, s.USER_ID, s.RAW_MODULE_ID, s.MODULE_ID, s.START_TIME, s.RAW_SCREEN_NAME, s.SCREEN_ID, s.SCREEN_TYPE, s.ACTIVE_TIME_SUM, s.IDLE_TIME_SUM)
The F_SCREEN_INSTANCE table has (DAY_ID, PARTIAL_ID) as a primary key and also is IOT (index organized table). This makes it an ideal candidate for a merge join: the rows are physically sorted by the lookup key.
So far so good. I've started a benchmark and the initial times looked good, 10s for one merge. But after about an hour, the merges were taking about 4 min with heavy tempdb usage (4GB per merge). The query plan below shows that F_SCREEN_INSTANCE is re-sorted before the merge, even though the table is ideally sorted already. And of course, as the table grows even more tempdb will be needed and the whole approach falls apart.
OK, so why re-sort the table? It turns to be a limitation of the merge join implementation: the second table is always sorted.
If an index exists, then the database can avoid sorting the first data
set. However, the database always sorts the second data set,
regardless of indexes.
O...K, so then can I make the main table to be first and the buffer to be second? Nope, that's not possible either. No matter how I list the tables in the USE_MERGE hint, the source table is always first.
Finally, here is my question: Have I missed anything? Is it possible to make this SORT MERGE approach work?
Here are some more details addressing questions you might ask:
What Oracle version? 12c.
Have you tried HASH JOIN? Yes, it's bad, as expected. The main table needs to be scanned in order to build the hash table. It can't scale as F_SCREEN_INSTANCE grows.
Have you tried LOOP JOIN? Yes, it's also bad. Considering the size of the buffer table, 100K lookups into F_SCREEN_INSTANCE take unreasonably long. Merges took about 3 min very quickly.
All in all, the MERGE JOIN is conceptually the best access strategy, but the Oracle implementation seems to be severely crippled by re-sorting the target table.
Sort merge outer joins will always put the outer-joined table second regardless of the hints. Adding an extra inner-join allows control of the join order, and then ROWID can be used to join again to the large table. Hopefully two good joins will work better than one bad join.
Assumptions
This answer assumes that the sort merge join is the fastest join, and that the manual is correct that the second data set is always sorted. It would be difficult to test these assumptions without significantly more information about the data.
Sample Schema
Here are some similar tables, with fake statistics to make the optimizer think they have 500M rows and 100K rows.
create table F_SCREEN_INSTANCE(DAY_ID number, PARTIAL_ID number, ID number, AGENT_USER_ID number,COMPUTER_ID number, RAW_APPLICATION_ID number, APP_USER_ID number, APPLICATION_ID number, USER_ID number, RAW_MODULE_ID number,MODULE_ID number, START_TIME date, RAW_SCREEN_NAME varchar2(100), SCREEN_ID number, SCREEN_TYPE number, ACTIVE_TIME_SUM number, IDLE_TIME_SUM number,
constraint f_screen_instance_pk primary key (day_id, partial_id)
) organization index;
create table F_SCREEN_INSTANCE_BUF(DAY_ID number, PARTIAL_ID number, ID number, AGENT_USER_ID number,COMPUTER_ID number, RAW_APPLICATION_ID number, APP_USER_ID number,APPLICATION_ID number, USER_ID number, RAW_MODULE_ID number, MODULE_ID number, START_TIME date, RAW_SCREEN_NAME varchar2(100), SCREEN_ID number, SCREEN_TYPE number, ACTIVE_TIME_SUM number, IDLE_TIME_SUM number,
constraint f_screen_instance_buf_pk primary key (day_id, partial_id)
);
begin
dbms_stats.set_table_stats(user, 'F_SCREEN_INSTANCE', numrows => 500000000);
dbms_stats.set_table_stats(user, 'F_SCREEN_INSTANCE_BUF', numrows => 100000);
end;
/
The Problem
The desired join and join order can be achieved with the LEADING hint when an inner join is used. The smaller table, F_SCREEN_INSTANCE_BUF, is the second table.
explain plan for
select /*+ use_merge(t s) leading(t s) */ *
from f_screen_instance_buf s
join f_screen_instance t
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID);
select * from table(dbms_xplan.display(format => '-predicate'));
Plan hash value: 563239985
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 19M| | 6898 (66)| 00:00:01 |
| 1 | MERGE JOIN | | 100K| 19M| | 6898 (66)| 00:00:01 |
| 2 | INDEX FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 46G| | 4504 (100)| 00:00:01 |
| 3 | SORT JOIN | | 100K| 9765K| 26M| 2393 (1)| 00:00:01 |
| 4 | TABLE ACCESS FULL| F_SCREEN_INSTANCE_BUF | 100K| 9765K| | 34 (6)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
The LEADING hint does not work when changing to a left join.
explain plan for
select /*+ use_merge(t s) leading(t s) */ *
from f_screen_instance_buf s
left join f_screen_instance t
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID);
select * from table(dbms_xplan.display(format => '-predicate'));
Plan hash value: 1472690071
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 19M| | 16M (1)| 00:10:34 |
| 1 | MERGE JOIN OUTER | | 100K| 19M| | 16M (1)| 00:10:34 |
| 2 | TABLE ACCESS BY INDEX ROWID| F_SCREEN_INSTANCE_BUF | 100K| 9765K| | 826 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | F_SCREEN_INSTANCE_BUF_PK | 100K| | | 26 (0)| 00:00:01 |
| 4 | SORT JOIN | | 500M| 46G| 131G| 16M (1)| 00:10:34 |
| 5 | INDEX FAST FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 46G| | 2703 (100)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------
This limitation is not documented as far as I can tell. I tried using the +outline setting of DBMS_XPLAN to see the full set of hints and then changed them around. But nothing I did could make the join order change for the LEFT JOIN version. Perhaps someone else can get this to work.
select * from table(dbms_xplan.display(format => '-predicate +outline'));
...
Outline Data
-------------
/*+
BEGIN_OUTLINE_DATA
USE_MERGE(#"SEL$0E991E55" "T"#"SEL$1")
LEADING(#"SEL$0E991E55" "S"#"SEL$1" "T"#"SEL$1")
INDEX_FFS(#"SEL$0E991E55" "T"#"SEL$1" ("F_SCREEN_INSTANCE"."DAY_ID" "F_SCREEN_INSTANCE"."PARTIAL_ID"))
INDEX(#"SEL$0E991E55" "S"#"SEL$1" ("F_SCREEN_INSTANCE_BUF"."DAY_ID"
"F_SCREEN_INSTANCE_BUF"."PARTIAL_ID"))
OUTLINE(#"SEL$9EC647DD")
OUTLINE(#"SEL$2")
MERGE(#"SEL$9EC647DD")
OUTLINE_LEAF(#"SEL$0E991E55")
ALL_ROWS
DB_VERSION('12.1.0.1')
OPTIMIZER_FEATURES_ENABLE('12.1.0.1')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
Possible Solution
--#3: Join the large table to the smaller result set. This uses the largest table twice,
--but the plan can use the ROWID for a very quick join.
explain plan for
merge into F_SCREEN_INSTANCE t
using
(
--#2: Now get the missing rows with an outer join. Since the _BUF table is
--small I assume it does not make a big difference exactly how it it joind
--to the 100K result set.
--The hints NO_MERGE and NO_PUSH_PRED are required to keep the INNER_JOIN
--inline view intact.
select /*+ no_merge(inner_join) no_push_pred(inner_join) */ inner_join.*
from f_screen_instance_buf s
left join
(
--#1: Get 100K rows efficiently with an inner join.
--Note that the ROWID is retrieved here.
select /*+ use_merge(t s) leading(t s) */ s.*, s.rowid s_rowid
from f_screen_instance_buf s
join f_screen_instance t
on (s.DAY_ID = t.DAY_ID and s.PARTIAL_ID = t.PARTIAL_ID)
) inner_join
on (s.DAY_ID = inner_join.DAY_ID and s.PARTIAL_ID = inner_join.PARTIAL_ID)
) s
on (s.s_rowid = t.rowid)
when matched then update set
t.ACTIVE_TIME_SUM = t.ACTIVE_TIME_SUM + s.ACTIVE_TIME_SUM,
t.IDLE_TIME_SUM = t.IDLE_TIME_SUM + s.IDLE_TIME_SUM
when not matched then insert values (
s.DAY_ID, s.PARTIAL_ID, s.ID, s.AGENT_USER_ID, s.COMPUTER_ID, s.RAW_APPLICATION_ID, s.APP_USER_ID, s.APPLICATION_ID, s.USER_ID, s.RAW_MODULE_ID, s.MODULE_ID, s.START_TIME, s.RAW_SCREEN_NAME, s.SCREEN_ID, s.SCREEN_TYPE, s.ACTIVE_TIME_SUM, s.IDLE_TIME_SUM);
It ain't pretty, but at least it generates a plan with the large table first in the sort merge join.
select * from table(dbms_xplan.display);
Plan hash value: 1086560566
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 500G| 173T| | 5355K (43)| 00:03:30 |
| 1 | MERGE | F_SCREEN_INSTANCE | | | | | |
| 2 | VIEW | | | | | | |
|* 3 | HASH JOIN OUTER | | 500G| 179T| 29M| 5355K (43)| 00:03:30 |
|* 4 | HASH JOIN OUTER | | 100K| 28M| 3712K| 8663 (53)| 00:00:01 |
| 5 | INDEX FAST FULL SCAN| F_SCREEN_INSTANCE_BUF_PK | 100K| 2539K| | 9 (0)| 00:00:01 |
| 6 | VIEW | | 100K| 25M| | 6898 (66)| 00:00:01 |
| 7 | MERGE JOIN | | 100K| 12M| | 6898 (66)| 00:00:01 |
| 8 | INDEX FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 12G| | 4504 (100)| 00:00:01 |
|* 9 | SORT JOIN | | 100K| 9765K| 26M| 2393 (1)| 00:00:01 |
| 10 | TABLE ACCESS FULL| F_SCREEN_INSTANCE_BUF | 100K| 9765K| | 34 (6)| 00:00:01 |
| 11 | INDEX FAST FULL SCAN | F_SCREEN_INSTANCE_PK | 500M| 46G| | 2703 (100)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("INNER_JOIN"."S_ROWID"=("T".ROWID(+)))
4 - access("S"."PARTIAL_ID"="INNER_JOIN"."PARTIAL_ID"(+) AND
"S"."DAY_ID"="INNER_JOIN"."DAY_ID"(+))
9 - access("S"."DAY_ID"="T"."DAY_ID" AND "S"."PARTIAL_ID"="T"."PARTIAL_ID")
filter("S"."PARTIAL_ID"="T"."PARTIAL_ID" AND "S"."DAY_ID"="T"."DAY_ID")

Oracle simple Select query optimization

I have below simple dynamic select query
Select RELATIONSHIP
from DIME_MASTER
WHERE CIN=? AND SSN=? AND ACCOUNT_NUMBER=?
The table has 1,083,701 records. This query takes 11 to 12 secs to execute which is expensive. DIME_MASTER table has ACCOUNT, CARD_NUMBER INDEXES. Please help me to optimize this query so that query execution time is under fraction of second.
Look at the predicate information:
--------------------------------------
1 - filter(TO_NUMBER("DIME_MASTER"."SSN")=226550956
AND TO_NUMBER("DIME_MASTER"."ACCOUNT_NUMBER")=4425050005218650
AND TO_NUMBER("DIME_MASTER"."CIN")=00335093464)
The type of your columns is NVARCHAR, but parameters in the query are NUMBERs.
Oracle must cast numbers to strings, but it is sometimes not very smart in casting.
Oracles and fortune-tellers are not always right ;)
These casts prevents the query from using indices.
Rewrite the query using explicit conversion into:
Select RELATIONSHIP
from DIME_MASTER
WHERE CIN=to_char(?) AND SSN=to_char(?) AND ACCOUNT_NUMBER=to_char(?)
then run this command:
exec dbms_stats.gather_table_stats( user, 'DIME_MASTER' );
and run the query and show us a new explain plan.
Would you please do not paste explain plans here, they are unreadable,
please use pastebin instead, and paste only links here, thank you.
Look at this simple example, it shows why you need explicit casts:
CREATE TABLE "DIME_MASTER" (
"ACCOUNT_NUMBER" NVARCHAR2(16)
);
insert into dime_master
select round( dbms_random.value( 1, 100000 )) from dual
connect by level <= 100000;
commit;
create index dime_master_acc_ix on dime_master( account_number );
explain plan for select * from dime_master
where account_number = 123;
select * from table( dbms_xplan.display );
Plan hash value: 1551952897
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 54 | 70 (3)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| DIME_MASTER | 3 | 54 | 70 (3)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("ACCOUNT_NUMBER")=123)
explain plan for select * from dime_master
where account_number = to_char( 123 );
select * from table( dbms_xplan.display );
Plan hash value: 3367829596
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 54 | 1 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| DIME_MASTER_ACC_IX | 3 | 54 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ACCOUNT_NUMBER"=U'123')
Depending on the cardinality of the columns (Total rows / unique values ) - you can create bitmap indexes on each column. Bitmap indexes are very usefull for and / or operations.
Rule of thumb says that a bitmap index is useful for cardinality of more then 10%.
create bitmap index DIME_MASTER_CIN_BIX on DIME_MASTER (CIN);

oracle faster paging query

I have two paging query that I consider to use.
First one is
SELECT * FROM ( SELECT rownum rnum, a.* from (
select * from members
) a WHERE rownum <= #paging.endRow# ) where rnum > #paging.startRow#
And the Second is
SELECT * FROM ( SELECT rownum rnum, a.* from (
select * from members
) a ) WHERE rnum BETWEEN #paging.startRow# AND #paging.endRow#
how do you think which query is the faster one?
I don't actually have availability of Oracle now but the best SQL query for paging is the following for sure
select *
from (
select rownum as rn, a.*
from (
select *
from my_table
order by ....a_unique_criteria...
) a
)
where rownum <= :size
and rn > (:page-1)*:size
http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
To achieve a consistent paging you should order rows using a unique criteria, doing so will avoid to load for page X a row you already loaded for a page Y ( !=X ).
EDIT:
1) Order rows using a unique criteria means to order data in way that each row will keep the same position at every execution of the query
2) An index with all the expressions used on the ORDER BY clause will help getting results faster, expecially for the first pages. With that index the execution plan choosen by the optimizer doesn't needs to sort the rows because it will return rows scrolling the index by its natural order.
3) By the way, the fastests way to page result from a query is to execute the query only once and to handle all the flow from the application side.
Take a look at the execution plans, example with 1000 rows:
SELECT *
FROM (SELECT ROWNUM rnum
,a.*
FROM (SELECT *
FROM members) a
WHERE ROWNUM <= endrow#)
WHERE rnum > startrow#;
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 39000 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 1000 | 39000 | 3 (0)| 00:00:01 |
| 2 | COUNT | | | | | |
|* 3 | FILTER | | | | | |
| 4 | TABLE ACCESS FULL| MEMBERS | 1000 | 26000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RNUM">"STARTROW#")
3 - filter("MEMBERS"."ENDROW#">=ROWNUM)
And 2.
SELECT *
FROM (SELECT ROWNUM rnum
,a.*
FROM (SELECT *
FROM members) a)
WHERE rnum BETWEEN startrow# AND endrow#;
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 39000 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 1000 | 39000 | 3 (0)| 00:00:01 |
| 2 | COUNT | | | | | |
| 3 | TABLE ACCESS FULL| MEMBERS | 1000 | 26000 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RNUM"<="ENDROW#" AND "RNUM">="STARTROW#")
Out of that I'd say version 2 could be slightly faster as it includes one step less. But I don't know about your indexes and data distribution so it's up to you to get these execution plans yourself and judge the situation for your data. Or simply test it.
A already answered in here But let me copypaste.
Just want to summarize the answers and comments. There are a number of ways doing a pagination.
Prior to oracle 12c there were no OFFSET/FETCH functionality, so take a look at whitepaper as the #jasonk suggested. It's the most complete article I found about different methods with detailed explanation of advantages and disadvantages. It would take a significant amount of time to copy-paste them here, so I want do it.
There is also a good article from jooq creators explaining some common caveats with oracle and other databases pagination. jooq's blogpost
Good news, since oracle 12c we have a new OFFSET/FETCH functionality. OracleMagazine 12c new features. Please refer to "Top-N Queries and Pagination"
You may check your oracle version by issuing the following statement
SELECT * FROM V$VERSION

Resources