I would like to know if creating an index in the field CREATION_DATE will make any improvement in this query
select TO_CHAR(CREATION_DATE, 'HH24') || ':mm' HOUR, count ('1') MESSAGE_HOUR
from t_hotel
where TO_CHAR(CREATION_DATE, 'dd/mm/yyyy') = TO_CHAR(SYSDATE, 'dd/mm/yyyy')
group by TO_CHAR(CREATION_DATE, 'HH24')
order by TO_CHAR(CREATION_DATE, 'HH24') ;
Yes, you can take advantage of a function-based index on CREATION_DATE column.
Take the following demo as an example where I have a table with a column of date type.
create table t_hotel (creation_date date);
insert into t_hotel select sysdate+0.5 from dual connect by level <=100;
100 rows affected
insert into t_hotel select sysdate+0.2 from dual connect by level <=100;
100 rows affected
explain plan for select TO_CHAR(CREATION_DATE, 'HH24') || ':mm' HOUR, count ('1') MESSAGE_HOUR
from t_hotel
where trunc(creation_date)= trunc(sysdate)
group by TO_CHAR(CREATION_DATE, 'HH24')
order by TO_CHAR(CREATION_DATE, 'HH24') ;
select * from table(dbms_xplan.display());
| PLAN_TABLE_OUTPUT |
| :----------------------------------------------------------------------------- |
| Plan hash value: 353888308 |
| |
| ------------------------------------------------------------------------------ |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ------------------------------------------------------------------------------ |
| | 0 | SELECT STATEMENT | | 200 | 1800 | 4 (25)| 00:00:01 | |
| | 1 | SORT GROUP BY | | 200 | 1800 | 4 (25)| 00:00:01 | |
| |* 2 | TABLE ACCESS FULL| T_HOTEL | 200 | 1800 | 3 (0)| 00:00:01 | |
| ------------------------------------------------------------------------------ |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 2 - filter(TRUNC(INTERNAL_FUNCTION("CREATION_DATE"))=TRUNC(SYSDATE#!) |
| ) |
| |
| Note |
| ----- |
| - dynamic sampling used for this statement (level=2) |
create index idx_cd on t_hotel(trunc(creation_date));
explain plan for select TO_CHAR(CREATION_DATE, 'HH24') || ':mm' HOUR, count ('1') MESSAGE_HOUR
from t_hotel
where trunc(creation_date)= trunc(sysdate)
group by TO_CHAR(CREATION_DATE, 'HH24')
order by TO_CHAR(CREATION_DATE, 'HH24') ;
select * from table(dbms_xplan.display());
| PLAN_TABLE_OUTPUT |
| :--------------------------------------------------------------------------------------- |
| Plan hash value: 2841908389 |
| |
| ---------------------------------------------------------------------------------------- |
| | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | |
| ---------------------------------------------------------------------------------------- |
| | 0 | SELECT STATEMENT | | 2 | 36 | 3 (34)| 00:00:01 | |
| | 1 | SORT GROUP BY | | 2 | 36 | 3 (34)| 00:00:01 | |
| | 2 | TABLE ACCESS BY INDEX ROWID| T_HOTEL | 2 | 36 | 2 (0)| 00:00:01 | |
| |* 3 | INDEX RANGE SCAN | IDX_CD | 1 | | 1 (0)| 00:00:01 | |
| ---------------------------------------------------------------------------------------- |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 3 - access(TRUNC(INTERNAL_FUNCTION("CREATION_DATE"))=TRUNC(SYSDATE#!)) |
| |
| Note |
| ----- |
| - dynamic sampling used for this statement (level=2) |
dbfiddle here
1) where clause should be written differently.
where creation_date >= trunc(sysdate) and creation_date < trunc(sysdate+1)
2) You can create two index.
- normal create index idx_creation_date on t_hotel(creation_date); improve where clause.
- fucntion base index create index idx_creation_date_hh24 on t_hotel(TO_CHAR(CREATION_DATE, 'HH24') ); probably it will improve group by and orderby but i'm not sure.
Related
My table has millions of records. In this query below, can I make Oracle 12c examine the first X rows only instead of doing a full table scan?
The value of X, I imagine should be Offset + Fetch Next , so in this case 15
SELECT * FROM table OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
Thanks in advance
Edit 1
These are the tables involved and this is the actual query
Orders - This table has 113k records in my test DB ( and over 8 million in prod db like my original question mentioned)
--------------------------
| Id | SKUField1|SKUField2|
--------------------------
| 1 | Value1 | Value2 |
| 2 | Value2 | Value2 |
| 3 | Value1 | Value3 |
--------------------------
Products - This table has 2 million records in my test DB ( prod db is similar)
---------------
| PId| SKU_NUM|
---------------
| 1 | Value1 |
| 2 | Value2 |
| 3 | Value3 |
---------------
Note that values of Orders.SKUField1 and Orders.SKUField2 come from the Products.SKU_NUM values
Actual Query:
SELECT /*+ gather_plan_statistics */ Id, PId, SKUField1, SKUField2, SKU_NUM
FROM Orders
LEFT JOIN (
-- this inner query reduces size of Products from 2 million rows down to 1462 rows
select * from Products where SKU_NUM in (
select SKUField1 from Orders
)
) p1 ON SKUField1 = p1.SKU_NUM
LEFT JOIN (
-- this inner query reduces size of table B from 2 million rows down to 459 rows
select * from Products where SKU_NUM in (
select SKUField2 from Orders
)
) p4 ON SKUField2 = p4.SKU_NUM
OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY
Execution Plan:
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Time | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.06 | 8013 | | | |
|* 1 | VIEW | | 1 | 00:00:01 | 10 |00:00:00.06 | 8013 | | | |
|* 2 | WINDOW NOSORT STOPKEY | | 1 | 00:00:01 | 15 |00:00:00.06 | 8013 | 27M| 1904K| |
|* 3 | HASH JOIN RIGHT OUTER | | 1 | 00:00:01 | 15 |00:00:00.06 | 8013 | 1162K| 1162K| 1344K (0)|
| 4 | VIEW | | 1 | 00:00:01 | 1462 |00:00:00.04 | 6795 | | | |
| 5 | NESTED LOOPS | | 1 | 00:00:01 | 1462 |00:00:00.04 | 6795 | | | |
| 6 | NESTED LOOPS | | 1 | 00:00:01 | 1462 |00:00:00.04 | 5333 | | | |
| 7 | SORT UNIQUE | | 1 | 00:00:01 | 1469 |00:00:00.04 | 3010 | 80896 | 80896 |71680 (0)|
| 8 | TABLE ACCESS FULL | Orders | 1 | 00:00:01 | 113K|00:00:00.02 | 3010 | | | |
|* 9 | INDEX UNIQUE SCAN | UIX_Product_SKU_NUM | 1469 | 00:00:01 | 1462 |00:00:00.01 | 2323 | | | |
| 10 | TABLE ACCESS BY INDEX ROWID | Products | 1462 | 00:00:01 | 1462 |00:00:00.01 | 1462 | | | |
|* 11 | HASH JOIN RIGHT OUTER | | 1 | 00:00:01 | 15 |00:00:00.02 | 1218 | 1142K| 1142K| 1335K (0)|
| 12 | VIEW | | 1 | 00:00:01 | 459 |00:00:00.02 | 1213 | | | |
| 13 | NESTED LOOPS | | 1 | 00:00:01 | 459 |00:00:00.02 | 1213 | | | |
| 14 | NESTED LOOPS | | 1 | 00:00:01 | 459 |00:00:00.02 | 754 | | | |
| 15 | SORT UNIQUE | | 1 | 00:00:01 | 462 |00:00:00.02 | 377 | 24576 | 24576 |22528 (0)|
| 16 | INDEX FAST FULL SCAN | Orders_SKUField2_IDX6 | 1 | 00:00:01 | 113K|00:00:00.01 | 377 | | | |
|* 17 | INDEX UNIQUE SCAN | UIX_Product_SKU_NUM | 462 | 00:00:01 | 459 |00:00:00.01 | 377 | | | |
| 18 | TABLE ACCESS BY INDEX ROWID| Products | 459 | 00:00:01 | 459 |00:00:00.01 | 459 | | | |
| 19 | TABLE ACCESS FULL | Orders | 1 | 00:00:01 | 15 |00:00:00.01 | 5 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------
Hence, based on the "A-Rows" column values for row Ids 8 and 16 in the execution plan, it seems like there are full table scans on the Orders table (though row id 16 atleast seems to be using an index). So my question is is it true that there is a full table scan on the orders table even though I am using Offset/Fetch Next
Although your FETCH clause may use a full table scan, Oracle will still only fetch the first X rows from the table.
In the following example, the "TABLE ACCESS FULL" operation does start to read the entire table, but it gets cutoff part of the way through by the "WINDOW NOSORT STOPKEY" operation. Not all full table scans actually scan the full table. You would see similar behavior if your code ended with WHERE ROWNUM <= 50.
CREATE TABLE some_table AS SELECT * FROM all_objects;
EXPLAIN PLAN FOR SELECT * FROM some_table OFFSET 5 ROWS FETCH NEXT 10 ROWS ONLY;
SELECT * FROM TABLE(dbms_xplan.display);
Plan hash value: 2559837639
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 7410 | 2 (0)| 00:00:01 |
|* 1 | VIEW | | 15 | 7410 | 2 (0)| 00:00:01 |
|* 2 | WINDOW NOSORT STOPKEY| | 15 | 2010 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | SOME_TABLE | 15 | 2010 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=15 AND
"from$_subquery$_002"."rowlimit_$$_rownumber">5)
2 - filter(ROW_NUMBER() OVER ( ORDER BY NULL )<=15)
The performance implications get more complicated if you also want to order the results. If that is the case, you may want to post the full query and execution plan.
(EDIT: 2022-09-25)
Yes, there is a full table scan on the ORDERS table happening on line 8 of the execution plan. As you mentioned, you can look at the "A-rows" column to tell what's really happening.
But the third full table scan of ORDERS, on line 19, is not a "full" full table scan. The operation "WINDOW NOSORT STOPKEY" stops that full table scan as soon as the 15 necessary rows are read. So the FETCH syntax is helping at least a little.
Applying a FETCH to a query does not mean that every single table will be limited. Although, in your query, it does seem like there ought to be a way to reduce the full table scans. Perhaps an index on SKUField1 would help?
Since Oracle as I know don't provide something like limit or top you can created by yourself like the following:
what is happening here, the inner query gets all the first 10 records and the outer query get those, you can still use any clauses like where or order or any others
SELECT * FROM (
SELECT * FROM Customers WHERE CustomerID <= 10 ORDER BY CustomerID
)
The full article will be found about this topic here at Oracle-Fetch
I am using Online Oracle so you can try it from your end, please let me know if you still have a problem.
I have several instances: dbA, dbB, dbC.., then i want to query data from local and link to other instance over dblink, and expected is l_ids will be send to remote site cause l_ids is ligh and table#dbA is numerous, just as followering:
DECLARE
out_row_data SYS_REFCURSOR;
l_ids UDT_TBL_NUMBER; -- TABLE OF NUMBER(10)
l_sql VARCHAR2(18000);
BEGIN
l_sql := 'SELECT * FROM tableX#dbA
WHERE column IN (SELECT * FROM TABLE(:l_ids));'
OPEN out_row_data FOR l_sql
USING l_ids;
END;/
But execution plan shows data of tableX fetched first, then join with table(l_ids).
Another try, using driving_site hint to specify it, but didn't work either:
DECLARE
out_row_data SYS_REFCURSOR;
l_ids UDT_TBL_NUMBER; -- TABLE OF NUMBER(10)
l_sql VARCHAR2(18000);
BEGIN
l_sql := 'SELECT /*+ MONITOR driving_site(X) */ * FROM TABLE(:l_ids) GIDS
LEFT JOIN tableX#dbA X ON GIDS.column_value = X.column;'
OPEN out_row_data FOR l_sql
USING l_ids;
END;/
I have no idea now, above statement all process successfully, but execution plan aren't as expected.
Can someone help me or need more info? :(
Update
I think if driving_site is work, the tableX REMOTE operation shouldn't exist.
Streamline dynamic sqltext:
l_sql := '
WITH matched_Y AS (
SELECT *
FROM tableY#dbB
), matched_X AS (
SELECT /*+ MONITOR no_merge(X) */
column1
,sum(column2)
FROM
(
SELECT /*+ MONITOR driving_site(RX) DRSITE */ * FROM
TABLE(:ids) GIDS
LEFT JOIN tableX#dbA RX
ON GIDS.column_value = RX.column1
) X
LEFT JOIN TABLE(:currency_table) ct
ON column3 >= ct.column3
GROUP BY column1
), matched_X2 AS (
SELECT *
FROM (matched_X)
GROUP BY column1
)
SELECT *
FROM matched_Y g
LEFT JOIN matched_X2 w ON g.column1 = w.column1
ORDER BY g.column4 DESC';
Execution plan
**---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Inst |IN-OUT|
---------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 3141 (100)| | | |
| 1 | SORT ORDER BY | | 11790 | 9936K| 10M| 3141 (1)| 00:00:01 | | |
|* 2 | HASH JOIN RIGHT OUTER | | 11790 | 9936K| | 1001 (1)| 00:00:01 | | |
| 3 | VIEW | | 1 | 130 | | 60 (2)| 00:00:01 | | |
| 4 | HASH GROUP BY | | 1 | 96 | | 60 (2)| 00:00:01 | | |
| 5 | NESTED LOOPS OUTER | | 20 | 1920 | | 59 (0)| 00:00:01 | | |
| 6 | VIEW | | 1 | 94 | | 30 (0)| 00:00:01 | | |
|* 7 | FILTER | | | | | | | | |
|* 8 | HASH JOIN | | 1 | 109 | | 30 (0)| 00:00:01 | | |
| 9 | REMOTE | tableX| 1 | 107 | | 1 (0)| 00:00:01 | DBLIN~ | R->S |
| 10 | COLLECTION ITERATOR PICKLER FETCH| | 100 | 200 | | 29 (0)| 00:00:01 | | |
|* 11 | COLLECTION ITERATOR PICKLER FETCH | | 20 | 40 | | 29 (0)| 00:00:01 | | |
|* 12 | VIEW | | 11790 | 8439K| | 941 (1)| 00:00:01 | | |
| 13 | REMOTE | | | | | | | DBLIN~ | R->S |
---------------------------------------------------------------------------------------------------------------------------**
In a true distributed query, the optimization is done on the sending site. Because your local site may not have access to the CBO statistics on the remote site.
The driving_site hint forces query execution to be done at a different site than the initiating instance. This is done when the remote table is much larger than the local table and you want the work (join, sorting) done remotely to save the back-and-forth network traffic.
Your plan is behaving correctly, from the perspective of the driving site, and without proper knowledge of the filter operations done by the optimizer ( this part of the plan is not in your post ), I would suggest to try and get rid of the nested loops outer produced by the left join, to see whether a hash join can behave better.
WITH matched_Y AS (
SELECT /*+ materialize */ *
FROM tableY#dbB
), matched_X AS (
SELECT /*+ MONITOR no_merge(X) use_hash(ct,x) */
column1
,sum(column2)
FROM
(
SELECT /*+ MONITOR driving_site(RX) DRSITE */ * FROM
TABLE(:ids) GIDS
LEFT JOIN tableX#dbA RX
ON GIDS.column_value = RX.column1
) X
LEFT JOIN TABLE(:currency_table) ct
ON column3 >= ct.column3
GROUP BY column1
), matched_X2 AS (
SELECT *
FROM (matched_X)
GROUP BY column1
)
SELECT *
FROM matched_Y g
LEFT JOIN matched_X2 w ON g.column1 = w.column1
ORDER BY g.column4 DESC
my below pagination query runs faster (2.5 sec) without order by .
if I use order by it get slower (180 sec).
Total number of record is only 90000
select * from(
select i.*,rownum rno from (
select opp.updat,nvl(s.name,c.vemail),s.name,c.vemail
from sfa_opportunities opp,sfa_company s, customer c
where opp.companyid = c.companyid(+)
and opp.custid = c.custid(+)
and opp.companyid = s.companyid(+)
and opp.sfacompid = s.sfacompid(+)
order by 2 asc, 1 asc
)i) where rno >= 1 and rno <= 30
I have given the explain plan below for reference.
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 97980 | 110M| | 14137 (1)| 00:03:18 |
|* 1 | VIEW | | 97980 | 110M| | 14137 (1)| 00:03:18 |
| 2 | COUNT | | | | | | |
| 3 | VIEW | | 97980 | 109M| | 14137 (1)| 00:03:18 |
| 4 | SORT ORDER BY | | 97980 | 6602K| 15M| 14137 (1)| 00:03:18 |
| 5 | NESTED LOOPS OUTER | | 97980 | 6602K| | 13137 (1)| 00:03:04 |
|* 6 | HASH JOIN RIGHT OUTER | | 97980 | 3635K| 1136K| 614 (1)| 00:00:09 |
| 7 | TABLE ACCESS FULL | SFA_COMPANY | 34851 | 714K| | 58 (0)| 00:00:01 |
| 8 | TABLE ACCESS FULL | SFA_OPPORTUNITIES | 97980 | 1626K| | 390 (1)| 00:00:06 |
|* 9 | TABLE ACCESS BY INDEX ROWID| CUSTOMER | 1 | 31 | | 1 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | PK_CUSTOMER_CUSTID | 1 | | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RNO"<=30 AND "RNO">=1)
6 - access("OPP"."COMPANYID"="S"."COMPANYID"(+) AND "OPP"."SFACOMPID"="S"."SFACOMPID"(+))
9 - filter("OPP"."COMPANYID"="C"."COMPANYID"(+))
10 - access("OPP"."CUSTID"="C"."CUSTID"(+))
You're sorting on nvl(s.name,c.vemail), then opp.updat. My guess is the NVL may prevent a lot of optimization, because Oracle can't tell what the value of that column is going to be without looking at every row in the joined result. You could try adding indexes on those three columns or a function based index on nvl(s.name,c.vemail).
Intro
I am writing a query to compare 2 different oracle database tables containing hashes.
I want to find which hashes from location 1, have not been migrated over to location 2.
I have three different queries, and have written explain plan for statements for them.
However, the results don't tell me that much.
Question
How do I find which is the most efficient and fastest?
Current Guess
My suspicion however is that the first query is the fastest since it makes a one-off use of the remote link. But this is just a guess that is not supported by actual results.
Code
--------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan1' FOR
select* from document doc left outer join migrated_document#V2_PROD migrated on doc.hash = migrated.document_hash AND migrated.document_hash is null ;
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan1','BASIC'));
---------------------------------------------------
| Id | Operation | Name |
---------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | HASH JOIN RIGHT OUTER| |
| 2 | REMOTE | MIGRATED_DOCUMENT |
| 3 | TABLE ACCESS FULL | DOCUMENT |
---------------------------------------------------
--------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan2' FOR
select* from document doc where not exists( select 1 from migrated_document#V2_PROD migrated where migrated.document_hash = doc.HASH ) ;
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan2','BASIC'));
------------------------------------------------
| Id | Operation | Name |
------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | FILTER | |
| 2 | TABLE ACCESS FULL| DOCUMENT |
| 3 | REMOTE | MIGRATED_DOCUMENT |
------------------------------------------------
--------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan3' FOR
select* from document doc where doc.hash not in ( select migrated.document_hash from migrated_document#V2_PROD migrated) ;
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan3','BASIC'));
------------------------------------------------
| Id | Operation | Name |
------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | NESTED LOOPS ANTI | |
| 2 | TABLE ACCESS FULL| DOCUMENT |
| 3 | REMOTE | MIGRATED_DOCUMENT |
------------------------------------------------
--------------------------
Update
I updated the explain plan statements to get more results.
Due to the remote operation... could something cost less but be more slow?
If I am reading the data correctly it seems that option 2 is best.
But I still think option 1 is quicker.
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 105K| 51M| 194 | | |
| 1 | HASH JOIN RIGHT OUTER| | 105K| 51M| 194 | | |
| 2 | REMOTE | MIGRATED_DOCUMENT | 1 | 275 | 2 | V2_MN~ | R->S |
| 3 | TABLE ACCESS FULL | DOCUMENT | 105K| 23M| 192 | | |
-------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 105K| 23M| 104K| | |
| 1 | FILTER | | | | | | |
| 2 | TABLE ACCESS FULL| DOCUMENT | 105K| 23M| 192 | | |
| 3 | REMOTE | MIGRATED_DOCUMENT | 1 | 50 | 1 | V2_MN~ | R->S |
----------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT|
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 105K| 29M| 526 | | |
| 1 | NESTED LOOPS ANTI | | 105K| 29M| 526 | | |
| 2 | TABLE ACCESS FULL| DOCUMENT | 105K| 23M| 192 | | |
| 3 | REMOTE | MIGRATED_DOCUMENT | 1 | 50 | 0 | V2_MN~ | R->S |
----------------------------------------------------------------------------------------
In a perfect world, you would choose the plan with the lowest cost. However I do not think your first query does what you want. It looks to me like no rows will join; you should be using a filter predicate rather than a join.
Instead of
select * from document doc
left outer join migrated_document#V2_PROD migrated
on doc.hash = migrated.document_hash
AND migrated.document_hash is null
it should be
select * from document doc
left outer join migrated_document#V2_PROD migrated
on doc.hash = migrated.document_hash
WHERE migrated.document_hash is null
I am little confused over the execution plan of an Oracle query. This is in Oracle Enterprise Edition 11.2.0.1.0 on platform IBM AIX 6.1. I have a table TEST1 (1 million rows) and another table TEST2 (50,000 rows). Both tables have identical columns. There is a view created as a union of these 2 tables. I am firing a query on this view, with an indexed column in the WHERE clause. What I could find is that the index is not used and a full table scan is resulted. With a slight modification of the query, it started using the index. I am wondering how this particular change can result in the plan change.
Please find the complete DDL + DML below. I have given simplified example. Actual schema and requirements are bit more complex. In fact the query in question is dynamically constructed and executed by an OCI code generator. My intention here is not to get alternatives, but to really understand what could be the logical reasoning behind the plan change (between, I am an application programmer and not a database administrator). Your help is much appreciated.
DROP TABLE TEST1 CASCADE CONSTRAINTS ;
DROP TABLE TEST2 CASCADE CONSTRAINTS ;
CREATE TABLE TEST1
(
ID NUMBER(20) NOT NULL,
NAME VARCHAR2(40),
DAY NUMBER(20)
)
PARTITION BY RANGE (DAY)
(
PARTITION P001 VALUES LESS THAN (2),
PARTITION P002 VALUES LESS THAN (3),
PARTITION P003 VALUES LESS THAN (4),
PARTITION P004 VALUES LESS THAN (5),
PARTITION P005 VALUES LESS THAN (6),
PARTITION P006 VALUES LESS THAN (7),
PARTITION P007 VALUES LESS THAN (8),
PARTITION P008 VALUES LESS THAN (9),
PARTITION P009 VALUES LESS THAN (10),
PARTITION P010 VALUES LESS THAN (11),
PARTITION P011 VALUES LESS THAN (12),
PARTITION P012 VALUES LESS THAN (13),
PARTITION P013 VALUES LESS THAN (14),
PARTITION P014 VALUES LESS THAN (15),
PARTITION P015 VALUES LESS THAN (16),
PARTITION P016 VALUES LESS THAN (17),
PARTITION P017 VALUES LESS THAN (18),
PARTITION P018 VALUES LESS THAN (19),
PARTITION P019 VALUES LESS THAN (20),
PARTITION P020 VALUES LESS THAN (21),
PARTITION P021 VALUES LESS THAN (22),
PARTITION P022 VALUES LESS THAN (23),
PARTITION P023 VALUES LESS THAN (24),
PARTITION P024 VALUES LESS THAN (25),
PARTITION P025 VALUES LESS THAN (26),
PARTITION P026 VALUES LESS THAN (27),
PARTITION P027 VALUES LESS THAN (28),
PARTITION P028 VALUES LESS THAN (29),
PARTITION P029 VALUES LESS THAN (30),
PARTITION P030 VALUES LESS THAN (31)
) ;
CREATE INDEX IX_ID on TEST1 (ID) INITRANS 4 STORAGE(FREELISTS 16) LOCAL
(
PARTITION P001,
PARTITION P002,
PARTITION P003,
PARTITION P004,
PARTITION P005,
PARTITION P006,
PARTITION P007,
PARTITION P008,
PARTITION P009,
PARTITION P010,
PARTITION P011,
PARTITION P012,
PARTITION P013,
PARTITION P014,
PARTITION P015,
PARTITION P016,
PARTITION P017,
PARTITION P018,
PARTITION P019,
PARTITION P020,
PARTITION P021,
PARTITION P022,
PARTITION P023,
PARTITION P024,
PARTITION P025,
PARTITION P026,
PARTITION P027,
PARTITION P028,
PARTITION P029,
PARTITION P030
) ;
CREATE TABLE TEST2
(
ID NUMBER(20) PRIMARY KEY NOT NULL,
NAME VARCHAR2(40),
DAY NUMBER(20)
) ;
CREATE OR REPLACE VIEW TEST_V AS
SELECT
ID, NAME, DAY
FROM
TEST1
UNION
SELECT
ID, NAME, DAY
FROM
TEST2 ;
begin
for count in 1..1000000
loop
insert into test1 values(count, 'John', mod(count, 30) + 1) ;
end loop ;
end ;
/
begin
for count in 1000000..1050000
loop
insert into test2 values(count, 'Mary', mod(count, 30) + 1) ;
end loop ;
end ;
/
commit ;
set lines 300 ;
set pages 1000 ;
-- Actual query
explain plan for
SELECT Key FROM
(
WITH recs AS
(
SELECT * FROM TEST_V WHERE ID = 70000
)
(
SELECT 1 AS Key FROM recs WHERE NAME = 'John'
)
UNION
(
SELECT 2 AS Key FROM recs WHERE NAME = 'Mary'
)
) ;
select * from table(dbms_xplan.display()) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | TempSpc | Cost (%CPU) | Time | Pstart | Pstop |
------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1611K | 4721K | | 13559 (1) | 00:02:43 | | |
| 1 | VIEW | | 1611K | 4721K | | 13559 (1) | 00:02:43 | | |
| 2 | TEMP TABLE TRANSFORMATION | | | | | | | | |
| 3 | LOAD AS SELECT | SYS_TEMP_0FD9D6610_34D3B6C | | | | | | | |
|* 4 | VIEW | TEST_V | 805K | 36M | | 10403 (1) | 00:02:05 | | |
| 5 | SORT UNIQUE | | 805K | 36M | 46M | 10403 (8) | 00:02:05 | | |
| 6 | UNION-ALL | | | | | | | | |
| 7 | PARTITION RANGE ALL | | 752K | 34M | | 721 (1) | 00:00:09 | 1 | 30 |
| 8 | TABLE ACCESS FULL | TEST1 | 752K | 34M | | 721 (1) | 00:00:09 | 1 | 30 |
| 9 | TABLE ACCESS FULL | TEST2 | 53262 | 2496K | | 68 (0) | 00:00:01 | | |
| 10 | SORT UNIQUE | | 1611K | 33M | 43M | 13559 (51) | 00:02:43 | | |
| 11 | UNION-ALL | | | | | | | | |
|* 12 | VIEW | | 805K | 16M | | 1429 (1) | 00:00:18 | | |
| 13 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6610_34D3B6C | 805K | 36M | | 1429 (1) | 00:00:18 | | |
|* 14 | VIEW | | 805K | 16M | | 1429 (1) | 00:00:18 | | |
| 15 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6610_34D3B6C | 805K | 36M | | 1429 (1) | 00:00:18 | | |
------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("ID"=70000)
12 - filter("NAME"='John')
14 - filter("NAME"='Mary')
-- Modified query (only change is absence of outermost SELECT)
explain plan for
WITH recs AS
(
SELECT * FROM TEST_V WHERE ID = 70000
)
(
SELECT 1 AS Key FROM recs WHERE NAME = 'John'
)
UNION
(
SELECT 2 AS Key FROM recs WHERE NAME = 'Mary'
) ;
select * from table(dbms_xplan.display()) ;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop |
-----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 88 | 6 (67) | 00:00:01 | | |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | | |
| 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6611_34D3B6C | | | | | | |
| 3 | VIEW | TEST_V | 2 | 96 | 4 (50) | 00:00:01 | | |
| 4 | SORT UNIQUE | | 2 | 96 | 4 (75) | 00:00:01 | | |
| 5 | UNION-ALL | | | | | | | |
| 6 | PARTITION RANGE ALL | | 1 | 48 | 1 (0) | 00:00:01 | 1 | 30 |
| 7 | TABLE ACCESS BY LOCAL INDEX ROWID | TEST1 | 1 | 48 | 1 (0) | 00:00:01 | 1 | 30 |
|* 8 | INDEX RANGE SCAN | IX_ID | 1 | | 1 (0) | 00:00:01 | 1 | 30 |
| 9 | TABLE ACCESS BY INDEX ROWID | TEST2 | 1 | 48 | 1 (0) | 00:00:01 | | |
|* 10 | INDEX UNIQUE SCAN | SYS_C001242692 | 1 | | 1 (0) | 00:00:01 | | |
| 11 | SORT UNIQUE | | 4 | 88 | 6 (67) | 00:00:01 | | |
| 12 | UNION-ALL | | | | | | | |
|* 13 | VIEW | | 2 | 44 | 2 (0) | 00:00:01 | | |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6611_34D3B6C | 2 | 96 | 2 (0) | 00:00:01 | | |
|* 15 | VIEW | | 2 | 44 | 2 (0) | 00:00:01 | | |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6611_34D3B6C | 2 | 96 | 2 (0) | 00:00:01 | | |
-----------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
8 - access("ID"=70000)
10 - access("ID"=70000)
13 - filter("NAME"='John')
15 - filter("NAME"='Mary')
quit ;
thanks & regards,
Reji
I can not reproduce this in 11.2.0.3, I don't think there is a logical explanation for this behavior other than: you hit a bug, that apparently is solved in 11.2.0.3.
One thing that jumped immediately in my eye is the lack of object statistics and - if your output was complete - the fact that OPTIMIZER_DYNAMIC_SAMPLING is set to 0. You could try to reproduce with OPTIMIZER_DYNAMIC_SAMPLING=2. In that case the dynamic sampler kicks in if the object statistics are missing. BTW: don't use this feature instead of correct optimizer statistics. More info about dynamic sampling Dynamic sampling and its impact on the Optimizer
In your - nice documented - question and script/test case you try to make use of append and nologging. This only works for bulk inserts, not for row inserts with values. What would happen is for every insert: push-up the highwater mark and dump a full block of data in the free block, in your case that would have only 1 row .... Luckily, the database ignores this instruction.
Before you fire SQL to a table, make sure that you give it optimizer statistics. This will certainly help your case.