Intro
I am writing a query to compare 2 different oracle database tables containing hashes.
I want to find which hashes from location 1, have not been migrated over to location 2.
I have three different queries, and have written explain plan for statements for them.
However, the results don't tell me that much.
Question
How do I find which is the most efficient and fastest?
Current Guess
My suspicion however is that the first query is the fastest since it makes a one-off use of the remote link. But this is just a guess that is not supported by actual results.
Code
--------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan1' FOR
select* from document doc left outer join migrated_document#V2_PROD migrated on doc.hash = migrated.document_hash AND migrated.document_hash is null ;
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan1','BASIC'));
---------------------------------------------------
| Id | Operation | Name |
---------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH JOIN RIGHT OUTER| |
| 2 | REMOTE | MIGRATED_DOCUMENT |
| 3 | TABLE ACCESS FULL | DOCUMENT |
---------------------------------------------------
--------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan2' FOR
select* from document doc where not exists( select 1 from migrated_document#V2_PROD migrated where migrated.document_hash = doc.HASH ) ;
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan2','BASIC'));
------------------------------------------------
| Id | Operation | Name |
------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | FILTER | |
| 2 | TABLE ACCESS FULL| DOCUMENT |
| 3 | REMOTE | MIGRATED_DOCUMENT |
------------------------------------------------
--------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan3' FOR
select* from document doc where doc.hash not in ( select migrated.document_hash from migrated_document#V2_PROD migrated) ;
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan3','BASIC'));
------------------------------------------------
| Id | Operation | Name |
------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | NESTED LOOPS ANTI | |
| 2 | TABLE ACCESS FULL| DOCUMENT |
| 3 | REMOTE | MIGRATED_DOCUMENT |
------------------------------------------------
--------------------------
Update
I updated the explain plan statements to get more results.
Due to the remote operation... could something cost less but be more slow?
If I am reading the data correctly it seems that option 2 is best.
But I still think option 1 is quicker.
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 105K| 51M| 194 | | |
| 1 | HASH JOIN RIGHT OUTER| | 105K| 51M| 194 | | |
| 2 | REMOTE | MIGRATED_DOCUMENT | 1 | 275 | 2 | V2_MN~ | R->S |
| 3 | TABLE ACCESS FULL | DOCUMENT | 105K| 23M| 192 | | |
-------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 105K| 23M| 104K| | |
| 1 | FILTER | | | | | | |
| 2 | TABLE ACCESS FULL| DOCUMENT | 105K| 23M| 192 | | |
| 3 | REMOTE | MIGRATED_DOCUMENT | 1 | 50 | 1 | V2_MN~ | R->S |
----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 105K| 29M| 526 | | |
| 1 | NESTED LOOPS ANTI | | 105K| 29M| 526 | | |
| 2 | TABLE ACCESS FULL| DOCUMENT | 105K| 23M| 192 | | |
| 3 | REMOTE | MIGRATED_DOCUMENT | 1 | 50 | 0 | V2_MN~ | R->S |
----------------------------------------------------------------------------------------
In a perfect world, you would choose the plan with the lowest cost. However I do not think your first query does what you want. It looks to me like no rows will join; you should be using a filter predicate rather than a join.
Instead of
select * from document doc
left outer join migrated_document#V2_PROD migrated
on doc.hash = migrated.document_hash
AND migrated.document_hash is null
it should be
select * from document doc
left outer join migrated_document#V2_PROD migrated
on doc.hash = migrated.document_hash
WHERE migrated.document_hash is null
Related
My table has millions of records. In this query below, can I make Oracle 12c examine the first X rows only instead of doing a full table scan?
The value of X, I imagine should be Offset + Fetch Next , so in this case 15
SELECT * FROM table OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
Thanks in advance
Edit 1
These are the tables involved and this is the actual query
Orders - This table has 113k records in my test DB ( and over 8 million in prod db like my original question mentioned)
--------------------------
| Id | SKUField1|SKUField2|
--------------------------
| 1 | Value1 | Value2 |
| 2 | Value2 | Value2 |
| 3 | Value1 | Value3 |
--------------------------
Products - This table has 2 million records in my test DB ( prod db is similar)
---------------
| PId| SKU_NUM|
---------------
| 1 | Value1 |
| 2 | Value2 |
| 3 | Value3 |
---------------
Note that values of Orders.SKUField1 and Orders.SKUField2 come from the Products.SKU_NUM values
Actual Query:
SELECT /*+ gather_plan_statistics */ Id, PId, SKUField1, SKUField2, SKU_NUM
FROM Orders
LEFT JOIN (
-- this inner query reduces size of Products from 2 million rows down to 1462 rows
select * from Products where SKU_NUM in (
select SKUField1 from Orders
)
) p1 ON SKUField1 = p1.SKU_NUM
LEFT JOIN (
-- this inner query reduces size of table B from 2 million rows down to 459 rows
select * from Products where SKU_NUM in (
select SKUField2 from Orders
)
) p4 ON SKUField2 = p4.SKU_NUM
OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY
Execution Plan:
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Time | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.06 | 8013 | | | |
|* 1 | VIEW | | 1 | 00:00:01 | 10 |00:00:00.06 | 8013 | | | |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 00:00:01 | 15 |00:00:00.06 | 8013 | 27M| 1904K| |
|* 3 | HASH JOIN RIGHT OUTER | | 1 | 00:00:01 | 15 |00:00:00.06 | 8013 | 1162K| 1162K| 1344K (0)|
| 4 | VIEW | | 1 | 00:00:01 | 1462 |00:00:00.04 | 6795 | | | |
| 5 | NESTED LOOPS | | 1 | 00:00:01 | 1462 |00:00:00.04 | 6795 | | | |
| 6 | NESTED LOOPS | | 1 | 00:00:01 | 1462 |00:00:00.04 | 5333 | | | |
| 7 | SORT UNIQUE | | 1 | 00:00:01 | 1469 |00:00:00.04 | 3010 | 80896 | 80896 |71680 (0)|
| 8 | TABLE ACCESS FULL | Orders | 1 | 00:00:01 | 113K|00:00:00.02 | 3010 | | | |
|* 9 | INDEX UNIQUE SCAN | UIX_Product_SKU_NUM | 1469 | 00:00:01 | 1462 |00:00:00.01 | 2323 | | | |
| 10 | TABLE ACCESS BY INDEX ROWID | Products | 1462 | 00:00:01 | 1462 |00:00:00.01 | 1462 | | | |
|* 11 | HASH JOIN RIGHT OUTER | | 1 | 00:00:01 | 15 |00:00:00.02 | 1218 | 1142K| 1142K| 1335K (0)|
| 12 | VIEW | | 1 | 00:00:01 | 459 |00:00:00.02 | 1213 | | | |
| 13 | NESTED LOOPS | | 1 | 00:00:01 | 459 |00:00:00.02 | 1213 | | | |
| 14 | NESTED LOOPS | | 1 | 00:00:01 | 459 |00:00:00.02 | 754 | | | |
| 15 | SORT UNIQUE | | 1 | 00:00:01 | 462 |00:00:00.02 | 377 | 24576 | 24576 |22528 (0)|
| 16 | INDEX FAST FULL SCAN | Orders_SKUField2_IDX6 | 1 | 00:00:01 | 113K|00:00:00.01 | 377 | | | |
|* 17 | INDEX UNIQUE SCAN | UIX_Product_SKU_NUM | 462 | 00:00:01 | 459 |00:00:00.01 | 377 | | | |
| 18 | TABLE ACCESS BY INDEX ROWID| Products | 459 | 00:00:01 | 459 |00:00:00.01 | 459 | | | |
| 19 | TABLE ACCESS FULL | Orders | 1 | 00:00:01 | 15 |00:00:00.01 | 5 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------
Hence, based on the "A-Rows" column values for row Ids 8 and 16 in the execution plan, it seems like there are full table scans on the Orders table (though row id 16 atleast seems to be using an index). So my question is is it true that there is a full table scan on the orders table even though I am using Offset/Fetch Next
Although your FETCH clause may use a full table scan, Oracle will still only fetch the first X rows from the table.
In the following example, the "TABLE ACCESS FULL" operation does start to read the entire table, but it gets cutoff part of the way through by the "WINDOW NOSORT STOPKEY" operation. Not all full table scans actually scan the full table. You would see similar behavior if your code ended with WHERE ROWNUM <= 50.
CREATE TABLE some_table AS SELECT * FROM all_objects;
EXPLAIN PLAN FOR SELECT * FROM some_table OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
SELECT * FROM TABLE(dbms_xplan.display);
Plan hash value: 2559837639
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 7410 | 2 (0)| 00:00:01 |
|* 1 | VIEW | | 15 | 7410 | 2 (0)| 00:00:01 |
|* 2 | WINDOW NOSORT STOPKEY| | 15 | 2010 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | SOME_TABLE | 15 | 2010 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=15 AND
"from$_subquery$_002"."rowlimit_$$_rownumber">5)
2 - filter(ROW_NUMBER() OVER ( ORDER BY NULL )<=15)
The performance implications get more complicated if you also want to order the results. If that is the case, you may want to post the full query and execution plan.
(EDIT: 2022-09-25)
Yes, there is a full table scan on the ORDERS table happening on line 8 of the execution plan. As you mentioned, you can look at the "A-rows" column to tell what's really happening.
But the third full table scan of ORDERS, on line 19, is not a "full" full table scan. The operation "WINDOW NOSORT STOPKEY" stops that full table scan as soon as the 15 necessary rows are read. So the FETCH syntax is helping at least a little.
Applying a FETCH to a query does not mean that every single table will be limited. Although, in your query, it does seem like there ought to be a way to reduce the full table scans. Perhaps an index on SKUField1 would help?
Since Oracle as I know don't provide something like limit or top you can created by yourself like the following:
what is happening here, the inner query gets all the first 10 records and the outer query get those, you can still use any clauses like where or order or any others
SELECT * FROM (
SELECT * FROM Customers WHERE CustomerID <= 10 ORDER BY CustomerID
)
The full article will be found about this topic here at Oracle-Fetch
I am using Online Oracle so you can try it from your end, please let me know if you still have a problem.
I have created a materialized view on SH2 but, when I run my explain plan statement I can't see the materialized view being used in the plan table output. I'm not sure if it's a more complex materialized view with additional key columns to join to other dimensions, so I'm a bit confused why the materialized view isn't being utilized as I'm refering to the SH2 prefix in my select query.
CREATE MATERIALIZED VIEW fweek_pscat_sales_mv
PCTFREE 5
BUILD IMMEDIATE
REFRESH COMPLETE
ENABLE QUERY REWRITE
AS
SELECT t.week_ending_day
, p.prod_subcategory
, sum(s.amount_sold) AS Money
, s.channel_id
, s.promo_id
FROM sales s
, times t
, products p
WHERE s.time_id = t.time_id
AND s.prod_id = p.prod_id
GROUP BY t.week_ending_day
, p.prod_subcategory
, s.channel_id
, s.promo_id;
CREATE BITMAP INDEX FW_PSC_S_MV_SUBCAT_BIX
ON fweek_pscat_sales_mv(prod_subcategory);
CREATE BITMAP INDEX FW_PSC_S_MV_CHAN_BIX
ON fweek_pscat_sales_mv(channel_id);
CREATE BITMAP INDEX FW_PSC_S_MV_PROMO_BIX
ON fweek_pscat_sales_mv(promo_id);
CREATE BITMAP INDEX FW_PSC_S_MV_WD_BIX
ON fweek_pscat_sales_mv(week_ending_day);
spool &data_dir.EXP_query_on_SH2_2.txt
alter session set query_rewrite_integrity = TRUSTED;
alter session set query_rewrite_enabled = TRUE;
set timing on
EXPLAIN PLAN FOR
SELECT t.week_ending_day
, p.prod_subcategory
, sum(s.amount_sold) AS Money
, s.channel_id
, s.promo_id
FROM SH2.sales s
, SH2.times t
, SH2.products p
WHERE s.time_id = t.time_id
AND s.prod_id = p.prod_id
GROUP BY t.week_ending_day
, p.prod_subcategory
, s.channel_id
, s.promo_id;
REM Now Let us Display the Output of the Explain Plan
SET pagesize 9999
set linesize 250
set markup html preformat on
select * from table(dbms_xplan.display());
set linesize 80
spool off
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1016K| 60M| | 17365 (1)| 00:00:01 | | |
| 1 | HASH GROUP BY | | 1016K| 60M| 70M| 17365 (1)| 00:00:01 | | |
|* 2 | HASH JOIN | | 1016K| 60M| | 2178 (1)| 00:00:01 | | |
| 3 | VIEW | index$_join$_003 | 10000 | 224K| | 74 (0)| 00:00:01 | | |
|* 4 | HASH JOIN | | | | | | | | |
| 5 | INDEX FAST FULL SCAN | PRODUCTS_PK | 10000 | 224K| | 41 (0)| 00:00:01 | | |
| 6 | INDEX FAST FULL SCAN | PRODUCTS_PROD_SUBCAT_IX | 10000 | 224K| | 51 (0)| 00:00:01 | | |
|* 7 | HASH JOIN | | 1016K| 37M| | 2101 (1)| 00:00:01 | | |
| 8 | PART JOIN FILTER CREATE | :BF0000 | 1016K| 37M| | 2101 (1)| 00:00:01 | | |
| 9 | TABLE ACCESS FULL | TIMES | 1461 | 23376 | | 13 (0)| 00:00:01 | | |
| 10 | PARTITION RANGE JOIN-FILTER| | 1016K| 22M| | 2086 (1)| 00:00:01 |:BF0000|:BF0000|
| 11 | TABLE ACCESS FULL | SALES | 1016K| 22M| | 2086 (1)| 00:00:01 |:BF0000|:BF0000|
----------------------------------------------------------------------------------------------------------------------------------
Why would you expect reference to the MV from the baseline query. For that to happen Oracle would have to compare the query to all MVs to find a match. Even further it would require every query be compared to every MV. MVs are typically created to avoid running the baseline query by accessing the MV directly (that is why MVs are the stored results of a query). If you want the MV just select from it directly.
SELECT week_ending_day
, prod_subcategory
, Money
, channel_id
, promo_id
from fweek_pscat_sales_mv;
my below pagination query runs faster (2.5 sec) without order by .
if I use order by it get slower (180 sec).
Total number of record is only 90000
select * from(
select i.*,rownum rno from (
select opp.updat,nvl(s.name,c.vemail),s.name,c.vemail
from sfa_opportunities opp,sfa_company s, customer c
where opp.companyid = c.companyid(+)
and opp.custid = c.custid(+)
and opp.companyid = s.companyid(+)
and opp.sfacompid = s.sfacompid(+)
order by 2 asc, 1 asc
)i) where rno >= 1 and rno <= 30
I have given the explain plan below for reference.
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 97980 | 110M| | 14137 (1)| 00:03:18 |
|* 1 | VIEW | | 97980 | 110M| | 14137 (1)| 00:03:18 |
| 2 | COUNT | | | | | | |
| 3 | VIEW | | 97980 | 109M| | 14137 (1)| 00:03:18 |
| 4 | SORT ORDER BY | | 97980 | 6602K| 15M| 14137 (1)| 00:03:18 |
| 5 | NESTED LOOPS OUTER | | 97980 | 6602K| | 13137 (1)| 00:03:04 |
|* 6 | HASH JOIN RIGHT OUTER | | 97980 | 3635K| 1136K| 614 (1)| 00:00:09 |
| 7 | TABLE ACCESS FULL | SFA_COMPANY | 34851 | 714K| | 58 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL | SFA_OPPORTUNITIES | 97980 | 1626K| | 390 (1)| 00:00:06 |
|* 9 | TABLE ACCESS BY INDEX ROWID| CUSTOMER | 1 | 31 | | 1 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | PK_CUSTOMER_CUSTID | 1 | | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RNO"<=30 AND "RNO">=1)
6 - access("OPP"."COMPANYID"="S"."COMPANYID"(+) AND "OPP"."SFACOMPID"="S"."SFACOMPID"(+))
9 - filter("OPP"."COMPANYID"="C"."COMPANYID"(+))
10 - access("OPP"."CUSTID"="C"."CUSTID"(+))
You're sorting on nvl(s.name,c.vemail), then opp.updat. My guess is the NVL may prevent a lot of optimization, because Oracle can't tell what the value of that column is going to be without looking at every row in the joined result. You could try adding indexes on those three columns or a function based index on nvl(s.name,c.vemail).
Following query is taking around 45 seconds in oracle 11g
select count(cap.ISHIGH),ms.SID,ms.NUM from CDetail cap,MData ms
where cap.MDataID_FK=ms.MDataID_PK and trunc(cap.CREATEDTIME) between trunc(sysdate-10) and trunc(sysdate)
group by ms.SID,ms.NUM ;
explain plan :
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 766K| 32M| | 94421 (1)| 00:18:54 |
| 1 | HASH GROUP BY | | 766K| 32M| 41M| 94421 (1)| 00:18:54 |
|* 2 | HASH JOIN | | 766K| 32M| 21M| 85716 (1)| 00:17:09 |
| 3 | VIEW | VW_GBC_5 | 766K| 13M| | 73348 (1)| 00:14:41 |
| 4 | HASH GROUP BY | | 766K| 13M| 98M| 73348 (1)| 00:14:41 |
|* 5 | FILTER | | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| CDetail | 3217K| 58M| | 63738 (1)| 00:12:45 |
|* 7 | INDEX RANGE SCAN | IDX_CPCTYDTLTRNCCRTDTM | 3365K| | | 14679 (1)| 00:02:57 |
| 8 | TABLE ACCESS FULL | MData | 871K| 22M| | 9665 (1)| 00:01:56 |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ITEM_1"="MS"."MDataID_PK")
5 - filter(TRUNC(SYSDATE#!-10)<=TRUNC(SYSDATE#!))
7 - access(TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))>=TRUNC(SYSDATE#!-10) AND
TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))<=TRUNC(SYSDATE#!))
table MData contains around 900,000 rows and table CDetail contains 23,000,000 rows.
Should I introduce any new index or any other way to optimize the above query.
Edit 3. IDX_CPCTYDTLTRNCCRTDTM is a functional index on trunc(CREATEDTIME)
Edit :1
explain plan :for full table scan using hint /+full(Cdetail)/
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 780K| 33M| | 160K (2)| 00:32:01 |
| 1 | HASH GROUP BY | | 780K| 33M| 42M| 160K (2)| 00:32:01 |
|* 2 | HASH JOIN | | 780K| 33M| 22M| 151K (2)| 00:30:15 |
| 3 | VIEW | VW_GBC_5 | 780K| 13M| | 138K (2)| 00:27:46 |
| 4 | HASH GROUP BY | | 780K| 14M| 230M| 138K (2)| 00:27:46 |
|* 5 | FILTER | | | | | | |
|* 6 | TABLE ACCESS FULL| CDetail | 7521K| 136M| | 120K (2)| 00:24:02 |
| 7 | TABLE ACCESS FULL | MData | 890K| 22M| | 9666 (1)| 00:01:56 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ITEM_1"="MS"."MDataID_PK")
5 - filter(TRUNC(SYSDATE#!-10)<=TRUNC(SYSDATE#!))
6 - filter(TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))>=TRUNC(SYSDATE#!-10) AND
TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))<=TRUNC(SYSDATE#!))
Thank you for sharing an explain plan; thats a good start. The thing with an explain plan however is that it gives you estimates, not actuals. If you can, can you get a SQL Monitor report? This will show you the actual cardinality and show you where time is being spent in the query.
The date filter is expecting about 3M rows (ID's 6 an 7)? Is that accurate?
What is the definition of the IDX_CPCTYDTLTRNCCRTDTM index? Does it happen to be function based?
Just to validate my thinking, can you add the following hint, run the query and get the explain plan again.
select /*+ full( cap ) */ ...
I have a top N query that is giving me problems.
First of all, I have a query like the following:
select /*+ gather_plan_statistics */ * from
(
select rowid
from payer_subscription ps
where ps.subscription_status = :i_subscription_status
and ps.merchant_id = :merchant_id2
order by transaction_date desc
) where rownum <= :i_rowcount;
This query works well. It can very efficiently find me the top 10 rows for a massive data set, using an index on merchant_id, subscription_status, transaction_date.
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 4 |
|* 1 | COUNT STOPKEY | | 1 | | 10 |00:00:00.01 | 4 |
| 2 | VIEW | | 1 | 11 | 10 |00:00:00.01 | 4 |
|* 3 | INDEX RANGE SCAN DESCENDING| SODTEST2_IX | 1 | 100 | 10 |00:00:00.01 | 4 |
-------------------------------------------------------------------------------------------------------
As you can see the estimated actual rows at each stage are 10, which is correct.
Now, I have a requirement to get the top N records for a set of merchant_Ids, so if I change the query to include two merchant_ids, the performance tanks:
select /*+ gather_plan_statistics */ * from
(
select rowid
from payer_subscription ps
where ps.subscription_status = :i_subscription_status
and (ps.merchant_id = :merchant_id or
ps.merchant_id = :merchant_id2 )
order by transaction_date desc
) where rownum <= :i_rowcount;
----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.17 | 178 | | | |
|* 1 | COUNT STOPKEY | | 1 | | 10 |00:00:00.17 | 178 | | | |
| 2 | VIEW | | 1 | 200 | 10 |00:00:00.17 | 178 | | | |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 200 | 10 |00:00:00.17 | 178 | 2048 | 2048 | 2048 (0)|
| 4 | INLIST ITERATOR | | 1 | | 42385 |00:00:00.10 | 178 | | | |
|* 5 | INDEX RANGE SCAN | SODTEST2_IX | 2 | 200 | 42385 |00:00:00.06 | 178 | | | |
----------------------------------------------------------------------------------------------------------------------------
Notice now that there are 42K rows coming out of the two index range scans - Oracle is no longer aborting the index range scan when it reaches 10 rows. What I thought would happen, is that Oracle would get at most 10 rows for each merchant_id, knowing that at most 10 rows are to be returned by the query. Then it would sort that 10 + 10 rows and output the top 10 based on the transaction date, but it refuses to do that.
Does anyone know how I can get the performance of the first query, when I need to pass a list of merchants into the query? I could probably get the performance using a union all, but the list of merchants is variable, and could be anywhere between 1 or 2 to several 100.
You can use --+ use_concat hint to make Oracle execute query as if it was a UNION ALL.
From documentation:
The USE_CONCAT hint instructs the optimizer to transform combined
OR-conditions in the WHERE clause of a query into a compound query
using the UNION ALL set operator. Without this hint, this
transformation occurs only if the cost of the query using the
concatenations is cheaper than the cost without them. The USE_CONCAT
hint overrides the cost consideration.
There are many cases where use_concat is ignored.
See: MOS Note: USE_CONCAT hint on different versions (Doc ID 259741.1)
I have had success in 10.2.0.4, 11.2.0.1 with OR_EXPAND where USE_CONCAT will not work.
/*+ OR_EXPAND( alias column_name ) */
Documented here:
http://www.hellodba.com/reader.php?ID=199&lang=EN
I'm not sure if this helps, but you cah try to replace the OR operator with IN:
and ps.merchant_id IN (:merchant_id, :merchant_id2)