When does Oracle index null column values? - oracle

I used to think that Oracle does not index a row when one of the column values is null.
Some simple experimentation shows this to be not the case. I was able to run some queries unexpectedly accessing only indexes even though some columns were nullable (which of course was a pleasant surprise).
A Google search led to some blogs with conflicting answers: I have read that a row gets indexed unless all indexed columns are null, and also that a row gets indexed unless the leading column value for the index is null.
So, in what cases does a row not enter an index? Is this Oracle version specific?

If any indexed column contains a non-null value that row will be indexed. As you can see in the following example only one row doesn't get indexed and that's the row which has NULL in both indexed columns. You can also see that Oracle definitely does index a row when the leading index column has a NULL value.
SQL> create table big_table as
2 select object_id as pk_col
3 , object_name as col_1
4 , object_name as col_2
5 from all_objects
6 /
Table created.
SQL> select count(*) from big_table
2 /
COUNT(*)
----------
69238
SQL> insert into big_table values (9999990, null, null)
2 /
1 row created.
SQL> insert into big_table values (9999991, 'NEW COL 1', null)
2 /
1 row created.
SQL> insert into big_table values (9999992, null, 'NEW COL 2')
2 /
1 row created.
SQL> select count(*) from big_table
2 /
COUNT(*)
----------
69241
SQL> create index big_i on big_table(col_1, col_2)
2 /
Index created.
SQL> exec dbms_stats.gather_table_stats(user, 'BIG_TABLE', cascade=>TRUE)
PL/SQL procedure successfully completed.
SQL> select num_rows from user_indexes where index_name = 'BIG_I'
2 /
NUM_ROWS
----------
69240
SQL> set autotrace traceonly exp
SQL>
SQL> select pk_col from big_table
2 where col_1 = 'NEW COL 1'
3 /
Execution Plan
----------------------------------------------------------
Plan hash value: 1387873879
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 2 | 60 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | BIG_I | 2 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COL_1"='NEW COL 1')
SQL> select pk_col from big_table
2 where col_2 = 'NEW COL 2'
3 /
Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 176 (1)| 00:00:03 |
|* 1 | TABLE ACCESS FULL| BIG_TABLE | 2 | 60 | 176 (1)| 00:00:03 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COL_2"='NEW COL 2')
SQL> select pk_col from big_table
2 where col_1 is null
3 and col_2 = 'NEW COL 2'
4 /
Execution Plan
----------------------------------------------------------
Plan hash value: 1387873879
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 53 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 1 | 53 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | BIG_I | 2 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COL_1" IS NULL AND "COL_2"='NEW COL 2')
filter("COL_2"='NEW COL 2')
SQL> select pk_col from big_table
2 where col_1 is null
3 and col_2 is null
4 /
Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 53 | 176 (1)| 00:00:03 |
|* 1 | TABLE ACCESS FULL| BIG_TABLE | 1 | 53 | 176 (1)| 00:00:03 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COL_1" IS NULL AND "COL_2" IS NULL)
SQL>
This example run on Oracle 11.1.0.6. But I'm pretty confident it holds true for all versions.

And in addition to APC's answer: when you want to index a NULL value, you can add a constant expression to the index.
Example:
SQL> select * from v$version where rownum = 1
2 /
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
1 rij is geselecteerd.
SQL> create table t (id,status,fill)
2 as
3 select level
4 , nullif(ceil((level-1)/1000),0)
5 , lpad('*',1000,'*')
6 from dual
7 connect by level <= 10000
8 /
Tabel is aangemaakt.
SQL> select status
2 , count(*)
3 from t
4 group by status
5 /
STATUS COUNT(*)
---------- ----------
1 1000
2 1000
3 1000
4 1000
5 1000
6 1000
7 1000
8 1000
9 1000
10 999
1
11 rijen zijn geselecteerd.
SQL> create index i_status on t(status)
2 /
Index is aangemaakt.
SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true)
PL/SQL-procedure is geslaagd.
SQL> set autotrace traceonly
SQL> select *
2 from t
3 where status is null
4 /
1 rij is geselecteerd.
Uitvoeringspan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=201 Card=1 Bytes=1007)
1 0 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=201 Card=1 Bytes=1007)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
364 consistent gets
0 physical reads
0 redo size
1265 bytes sent via SQL*Net to client
242 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Please note the full table scan and the 364 consistent gets.
SQL> set autotrace off
SQL> create index i_status2 on t(status,1)
2 /
Index is aangemaakt.
SQL> set autotrace traceonly
SQL> select *
2 from t
3 where status is null
4 /
1 rij is geselecteerd.
Uitvoeringspan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=1007)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=1 Card=1 Bytes=1007)
2 1 INDEX (RANGE SCAN) OF 'I_STATUS2' (INDEX) (Cost=1 Card=1)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
1 physical reads
0 redo size
1265 bytes sent via SQL*Net to client
242 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
And now it uses the index and has only 3 consistent gets.
Regards,
Rob.

In addition to APC's answer, NULLS are indexed in bitmap indexes.

Related

Oracle 12c to Oracle 19c Migration - Unusual behavior

I'm performing test before we migrate Oracle database from 12c to 19c.
I'm facing an unusual behavior, which can be explained with below example.
I've condensed it to reproduceable issue as below.
Sorry for making it very long post, I wanted to feed in all possible information.
If any further information is required, then I would be happy to provide that.
Oracle 12c & 19c versions are as below (from v$instance):
VERSION
12.1.0.2.0
VERSION VERSION_FULL
19.0.0.0.0 19.16.0.0.0
Sample Data
2 tables are as below
TAB1
COLUMN_NAME DATA_TYPE NULLABLE
COL1 VARCHAR2(20 BYTE) Yes
RUL_NO NUMBER(11,0) No
INP_DT TIMESTAMP(6) WITH LOCAL TIME ZONE No
TAB2
COLUMN_NAME DATA_TYPE NULLABLE
COL1 VARCHAR2(20 BYTE) No
COL6 NUMBER(11,0) No
COL7 VARCHAR2(5 BYTE) Yes
INP_DT TIMESTAMP(6) WITH LOCAL TIME ZONE No
Index on TAB2 -
create index tab2_IDX1 on tab2(col6);
create index tab2_IDX2 on tab2(col1);
Problem SQL
SELECT *
FROM tab1 t
WHERE (EXISTS (SELECT 1
FROM tab2 b
WHERE b.col6 = 1088609
AND NVL(t.col1, '<NULL>') = NVL(b.col1, '<NULL>'))
OR t.col1 IS NULL);
This sql returns 10 rows on 12c db, but none on 19c db which is causing regression on 19c side.
Here's the output when this sql is run in trace mode.
12c Trace
SQL> set autotrace traceonly
SQL> set linesize 200
SQL> set pagesize 1000
SQL> SELECT *
FROM tab1 t
WHERE (EXISTS (SELECT 1
FROM tab2 b
WHERE b.col6 = 1088609
AND NVL(t.col1, '<NULL>') = NVL(b.col1, '<NULL>'))
OR t.col1 IS NULL);
2 3 4 5 6 7
10 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 572408916
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 160 | 3 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | TAB1 | 10 | 160 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB2 | 1 | 15 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | TAB2_IDX3 | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."COL1" IS NULL OR EXISTS (SELECT 0 FROM "TAB2" "B" WHERE
"B"."COL1"=NVL(:B1,'<NULL>') AND "B"."COL6"=1088609))
3 - filter("B"."COL6"=1088609)
4 - access("B"."COL1"=NVL(:B1,'<NULL>'))
Note
-----
- dynamic statistics used: dynamic sampling (level=4)
19c Trace
SQL> set autotrace traceonly
SQL> set linesize 200
SQL> set pagesize 1000
SQL>
SQL> SELECT *
FROM tab1 t
WHERE (EXISTS (SELECT 1
FROM tab2 b
WHERE b.col6 = 1088609
AND NVL(t.col1, '<NULL>') = NVL(b.col1, '<NULL>'))
OR t.col1 IS NULL);
2 3 4 5 6 7
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 4175419084
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 31 | 5 (0)| 00:00:01 |
|* 1 | HASH JOIN SEMI NA | | 1 | 31 | 5 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | TAB1 | 10 | 160 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB2 | 1 | 15 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | TAB2_IDX1 | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access(NVL("T"."COL1",'<NULL>')="B"."COL1")
4 - access("B"."COL6"=1088609)
Note
-----
- this is an adaptive plan
Can somebody suggest why this behavior is observed in 19c, as it should return 10 rows like 12c db. It seems HASH JOIN SEMI NA step on 19c side is causing this issue, but I can't be sure.
Any help on this matter is very much appreciated.
Thanks,
Kailash
It seems that the 19c execution plan somehow loose the OR t.col1 IS NULL predicate in the Predicate Information
1 - access(NVL("T"."COL1",'<NULL>')="B"."COL1")
Which is most probably a bug (wrong predicate elimination??).
Anyway a workround (if you can change the query) seems to be to add the OR into the EXISTS subquery
SELECT *
FROM tab1 t
WHERE EXISTS (SELECT 1
FROM tab2 b
WHERE b.col6 = 1088609
AND NVL(t.col1, '<NULL>') = NVL(b.col1, '<NULL>')
OR t.col1 IS NULL );
This also implicitely disables the NA semi join back to the FILTER plan from 12c, which is an other sign that this is the cause of the wrong behaviour.
Open a SR with Oracle for a final solution!

index part of column with flexible number of character

I want to index part of a column with value like this : #aaa/453 . it means that the value in this column consist of four parts , symbol character / number .
in our query we just have Numerical section so we want to have an index on this part.
the number of character in each part is changeable .
please help me
Here's an example: table contains values similar to what you described.
SQL> create table test (col varchar2(20));
Table created.
SQL> insert into test
2 select '#aaa/453' from dual union all
3 select '$bcdxyz/35' from dual union all
4 select '#gf/203' from dual;
3 rows created.
In order to select rows from the table, one option is to use such a query:
SQL> select * from test
2 where regexp_substr(col, '\d+$') = '35';
COL
--------------------
$bcdxyz/35
So, let's create a function-based index:
SQL> create index i1test on test (regexp_substr(col, '\d+$'));
Index created.
What does the explain plan say?
SQL> explain plan for
2 select * from test
3 where regexp_substr(col, '\d+$') = '35';
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 210954056
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 54 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 54 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | I1TEST | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access( REGEXP_SUBSTR ("COL",'\d+$')='35')
Note
-----
- dynamic sampling used for this statement (level=2)
18 rows selected.
SQL>
Looks like it might help. Try it, see how it behaves.

Oracle not use extra indexes when there's an ORDER BY

I'm playing with Oracle 12 and indexes...
In a query like this:
SELECT a, b, c FROM table WHERE col1 = val1 AND col2 = val2 ORDER BY id DESC
(where id is the primary key of the table), Oracle always uses the index on the primary key.
So even if I create an index on the columns col1 and col2, since there's the ORDER BY statement, it doesn't use the index.
So can I infer that this is a general rule? Should I never put extra indexes in case all my queries contains "ORDER BY ID" ?
Here is my table structure:
ID NUMBER GENERATED ALWAYS AS IDENTITY NOCACHE ORDER,
USERNAME VARCHAR2(30 CHAR)
TYPE_A CHAR(1 BYTE)
TYPE_B CHAR(1 BYTE)
CREATED DATE
UPDATED DATE
ALTER TABLE my_table
ADD CONSTRAINT my_table_pk
PRIMARY KEY (ID)
USING INDEX TABLESPACE XXX;
On the table I perform only this query:
SELECT id, USERNAME, TYPE_A, TYPE_B, CREATED FROM table
where username = 'MYUSER'
AND created >= TO_DATE('2016-01-01','YYYY-MM-DD')
AND created <= TO_DATE('2016-06-30','YYYY-MM-DD')
AND TYPE_A = 1
order by ID desc;
One index: on pk (ID) (automatically created by oracle)
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 384 | 1 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| table | 2 | 384 | 1 (0)| 00:00:01 |
| 2 | INDEX FULL SCAN DESCENDING| INDEX_PK | 10 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Two indexes: first on pk and second on (USERNAME, CREATED, TYPE_A)
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 384 | 1 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| table | 2 | 384 | 1 (0)| 00:00:01 |
| 2 | INDEX FULL SCAN DESCENDING| INDEX_PK | 10 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
So the second index seems to be useless.
By the way If i remove the ORDER BY statement, Oracle uses the second index on USERNAME, CREATED, TYPE_A.
Thanks all!
Let me just give you a counterexample which shows that there are cases where Oracle will use the second index.
SQL> create table tab (
2 ID NUMBER GENERATED ALWAYS AS IDENTITY NOCACHE ORDER,
3 USERNAME VARCHAR2(30 CHAR),
4 TYPE_A CHAR(1 BYTE),
5 TYPE_B CHAR(1 BYTE),
6 CREATED DATE,
7 UPDATED DATE
8 )
9 /
Table created.
SQL> alter table tab add constraint tab_pk primary key (id) using index
2 /
Table altered.
SQL> create index SECOND_IDX on tab(username, created, type_a)
2 /
Index created.
SQL> insert into tab(username, type_a, type_b, created)
2 select 'OTHER_USER', '2', '2', date '2015-06-01'
3 from all_objects, all_objects where rownum <= 1e5;
100000 rows created.
SQL>
SQL> update tab
2 set username = 'MYUSER',
3 created = DATE '2016-06-01',
4 type_a = '1'
5 where id = 50000;
1 row updated.
SQL> commit;
Commit complete.
SQL> begin
2 dbms_stats.gather_table_stats(ownname => USER,
3 tabname => 'TAB',
4 estimate_percent => 100,
5 method_opt => 'FOR ALL INDEXED COLUMNS'
6 );
7 end;
8 /
PL/SQL procedure successfully completed.
SQL>
SQL> set autotrace traceonly exp
SQL>
SQL> SELECT id, USERNAME, TYPE_A, TYPE_B, CREATED FROM tab
2 where username = 'MYUSER'
3 AND created >= TO_DATE('2016-01-01','YYYY-MM-DD')
4 AND created <= TO_DATE('2016-06-30','YYYY-MM-DD')
5 AND TYPE_A = '1'
6 order by ID desc;
Execution Plan
----------------------------------------------------------
Plan hash value: 3658386757
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 29 | 5 (20)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 29 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB | 1 | 29 | 4 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SECOND_IDX | 1 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
In this case the reason to use the second index is the extremely high selectivity (one row out of 100000).
Well, in short answer - no, but we can't give you a general rule, because every time will be different as to a lot of different variable. For more specific answer you should include an explain plan of this query, and we'll have a better picture on why it doesn't use the index.
Oracle will know to use this index as long as ID column will be specified first .
You shouldn't add unnecessary indexes for selects that will only occur once in a lot of time or those that are slow, but not too slow. You should only add indexes related to the most common selects/updates that occurs on this table.
If a select with filters on col1 and col2 is repeatedly , then most likely(again, I don't know what other processes you are doing on this table) an index on all 3 columns will be better :
(ID,Col1,Col2)

using "or" in where clause for date type column issue when compare with sysdate in Oracle

I've just created a sample table with following data:
CREATE TABLE AAA ( DT DATE );
insert into aaa (DT) values (to_date('13-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('14-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('15-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('16-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('17-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('18-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('19-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('20-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('21-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('22-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('23-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('24-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('25-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('26-01-2013', 'dd-mm-yyyy'));
insert into aaa (DT) values (to_date('27-01-2013', 'dd-mm-yyyy'));
commit;
and then the following query, returns abnormal results: (15 records instead of 7)
select count(*) from aaa d
where
(d.dt > sysdate)
or
d.dt < to_date(20130120,'yyyymmdd')
but when I change left side and right side of "OR" it returns correct result: (7 records)
select count(*) from aaa d
where
d.dt < to_date(20130120,'yyyymmdd')
or
(d.dt > sysdate)
does anybody know what is this issue about and how to solve it?
add: replacing d.dt with d.dt+1 is also solving this problem,
d.dt+1 > sysdate+1
Well I am able to replicate it and the reason behind such behavior is Oracle's interpretation of predicates.
Version of OS and Oracle where this can be reproduced:
SQL> host ver
Microsoft Windows [Version 6.1.7601]
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for 64-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
SQL>
In the first case predicate is modified to filter("D"."DT" IS NOT NULL)
while in the second query, predicate works as provided filter("D"."DT"<TO_DATE(' 2013-01-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "D"."DT">SYSDATE#!)
SQL> select count(*)
2 from aaa d
3 where (d.dt > sysdate)
4 or d.dt < to_date('20130120','yyyymmdd')
5 /
COUNT(*)
----------
15
Execution Plan
----------------------------------------------------------
Plan hash value: 977873394
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| AAA | 15 | 135 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("D"."DT" IS NOT NULL)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
15 consistent gets
0 physical reads
0 redo size
346 bytes sent via SQL*Net to client
364 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> ed
Wrote file afiedt.buf
1 select count(*)
2 from aaa d
3 where d.dt < to_date('20130120','yyyymmdd')
4* or (d.dt > sysdate)
SQL>
/
COUNT(*)
----------
7
Execution Plan
----------------------------------------------------------
Plan hash value: 977873394
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| AAA | 7 | 63 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("D"."DT"<TO_DATE(' 2013-01-20 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') OR "D"."DT">SYSDATE#!)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
15 consistent gets
0 physical reads
0 redo size
346 bytes sent via SQL*Net to client
364 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
I could not figure out this behavior of Oracle, quite possible some experts can explain this.
Again in the third example, predicates are used correctly. filter("D"."DT"<TO_DATE(' 2013-01-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR INTERNAL_FUNCTION("D"."DT")+1>SYSDATE#!+1)
SQL> ed
Wrote file afiedt.buf
1 select count(*)
2 from aaa d
3 where (d.dt + 1 > sysdate + 1)
4* or d.dt < to_date('20130120','yyyymmdd')
SQL> /
COUNT(*)
----------
7
Execution Plan
----------------------------------------------------------
Plan hash value: 977873394
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| AAA | 7 | 63 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("D"."DT"<TO_DATE(' 2013-01-20 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') OR INTERNAL_FUNCTION("D"."DT")+1>SYSDATE#!+1)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
15 consistent gets
0 physical reads
0 redo size
346 bytes sent via SQL*Net to client
364 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
Its quite obvious that the same cannot be reproduced from Oracle Version 11.2.0.2.0 and 11.2.0.3.0 on Linux servers.
Update:
As Alex Poole mentioned in the comments - "This might be bug 9495697, 'Wrong results may be returned for a query containing two OR'd filter predicates on the same column, where the other side of one predicate is not a compile-time constant (eg. It is a bind, sysdate, etc..)"

oracle merge performance

MERGE INTO table1 t1
USING
(SELECT column1
FROM table2 join table3 on...
WHERE table2.column5 = 'xyz') t2
ON (t1.column1 = t2.column1 AND t1.column2 = somevalue
AND t1.column3 = someothervalue)
WHEN not matched
THEN
INSERT ...
The "on" part will reject most of the rows, but does merge query all the rows that are inside "using" first? In that case that part will be running uselessly most of the times because 95% of the rows will not match t1.column2 = somevalue
AND t1.column3 = someothervalue. Or is Oracle smart enough to not do that?
Yes, Oracle will rewrite the query to join table1 to the using view query. so if t1.column2 = somevalue AND t1.column3 = someothervalue is selective and Oracle realises this, you should see in the plan that the query will drive from TABLE1 and then join into the tables in the using view. just run an explain plan to check it. i.e.
set linesize 200 pagesize 200
explain plan for
merge....;
select * from table(dbms_xplan.display());
and you should see that Oracle has done this for you.
eg:
SQL> create table table1(id number primary key, t2_id number, str varchar2(20), notes varchar2(20));
Table created.
SQL> create table table2(id number primary key, notes varchar2(20));
Table created.
SQL>
SQL> insert into table1
2 select rownum, rownum, case mod(rownum, 100) when 0 then 'ONE' else 'TWO' end, null
3 from dual connect by level <=1000000;
1000000 rows created.
SQL>
SQL> insert into table2
2 select rownum, dbms_random.string('x', 10)
3 from dual connect by level <=1000000;
1000000 rows created.
SQL>
SQL> create index table1_idx on table1(str);
Index created.
SQL> exec dbms_stats.gather_table_stats(user, 'TABLE1', method_opt=>'for all indexed columns size skewonly');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user, 'TABLE2');
PL/SQL procedure successfully completed.
so ill add t1.str = 'ONE' when is very selective:
SQL> explain plan for
2 merge into table1 t1
3 using (select * from table2 t where t.id > 1000) t2
4 on (t2.id = t1.t2_id and t1.str = 'ONE')
5 when matched then update
6 set t1.notes = t2.notes;
Explained.
SQL> #explain ""
Plan hash value: 2050534005
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 441 | 11025 | 929 (5)| 00:00:12 |
| 1 | MERGE | TABLE1 | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN | | 441 | 12348 | 929 (5)| 00:00:12 |
|* 4 | TABLE ACCESS BY INDEX ROWID| TABLE1 | 441 | 5733 | 69 (2)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | TABLE1_IDX | 8828 | | 21 (0)| 00:00:01 |
|* 6 | TABLE ACCESS FULL | TABLE2 | 994K| 14M| 848 (4)| 00:00:11 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T"."ID"="T1"."T2_ID")
   4 - filter("T1"."T2_ID">1000)
   5 - access("T1"."STR"='ONE')
   6 - filter("T"."ID">1000)
you can see its applied the index on table1
INDEX RANGE SCAN | TABLE1_IDX
thus removing a lot of rows scanned on table1 (hash join was more appropriate considering my table, but in your case you may see a nested loop approach on the join step).

Resources