oracle index in the underlying sql query - oracle

I have a table with 2 columns
item_name varchar2
brand varchar2
both of them have bitmap index
let's say I create a view for a specific brand and rename the column item_name ,something like that
create view my_brand as
select item_name as item from table x where brand='x'
We cannot create an index on a normal view but what is Oracle doing when issuing the underlying query of that view? Is the index of the item_name column being used if we write select item from my_brand where item='item1'?
thanks

The answer will be “it depends”. The index access path is certainly an option open to the optimizer; but remember that the optimizer makes a cost based decision. So essentially it will evaluate the cost of all the available plans and choose the one with the lowest cost.
Here is an example:
create table tab1 ( item_name varchar2(15), brand varchar2(15) );
insert into tab1
select 'Name '||to_char( rownum), 'Brand '||to_char(mod(rownum,10))
from dual
connect by rownum < 1000000
commit;
exec dbms_stats.gather_table_stats( user, 'TAB1' );
create bitmap index bm1 on tab1 ( item_name );
create bitmap index bm2 on tab1 ( brand );
create or replace view my_brand
as select item_name as item from tab1 where brand = 'Brand 1';
explain plan for
select item from my_brand where item = 'Name 1001'
select * from table( dbms_xplan.display )
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TAB1 | 1 | 20 | 3 (0)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
|* 3 | BITMAP INDEX SINGLE VALUE | BM1 | | | | |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BRAND"='Brand 1')
3 - access("ITEM_NAME"='Name 1001')

We can create an index on a normal view
No, you can't
SQL> create table as_idx_view (item_name varchar2(10), brand varchar2(10));
Table created.
SQL> create view as_view as select item_name item from as_idx_view where brand = 'X';
View created.
SQL> create index as_idx_view_idx1 on as_view (item);
create index as_idx_view_idx1 on as_view (item)
*
ERROR at line 1:
ORA-01702: a view is not appropriate here

Related

Oracle12 Materialized Views only latest records

Say you have a table of car owners (car_owners) :
car_id
person_id
registration_date
...
And each time someone buy's a car there is a record inserted into this table.
Now I would like to create a materialized view that only holds the newest registration for each vehicle, that is when a record is inserted, the materialized view updates the record for this vehicle (if it exists) with the new record from the base table.
The materialized view only hold one record per car.
I tried something like this
create materialized view newest_owner
build immediately
refresh force on commit
select *
from car_owners c
where c.registration_date = (
select max(cc.registration_date)
from car_owners cc
where cc.car_id = c.car_id
);
It seems that materialized view do not like sub-selects.
Do you have any tips on how to do this or how to achieve this another way?
I have another solution for now, use triggers to update a separate table to hold the newest values, but I was hoping that materialized view could do the trick.
Thanks.
For this, you may have to use nested materialized views:
create table car_owners
(pk_col number primary key
,car_id number
,person_id number
,registration_date date
);
truncate table car_owners;
insert into car_owners
select rownum
,trunc(dbms_random.value(1,1000)) car_id
,mod(rownum,100000) person_id
,(sysdate-dbms_random.value(1,3000)) registration_date
from dual
connect by rownum <= 1000000;
commit;
exec dbms_stats.gather_Table_stats(null,'car_owners')
create materialized view log on car_owners with sequence, rowid
(car_id, registration_date) including new values;
create materialized view latest_registration
refresh fast on commit enable query rewrite
as
select c.car_id
,max(c.registration_date) max_registration_date
from car_owners c
group by c.car_id
/
create materialized view log on latest_registration with sequence, rowid
(car_id, max_registration_date) including new values;
create materialized view newest_owner
refresh fast on commit enable query rewrite
as
select c.rowid row_id,cl.rowid cl_rowid, c.pk_col, c.car_id, c.person_id, c.registration_date
from car_owners c
join latest_registration cl
on c.registration_date = cl.max_registration_date
and c.car_id = cl.car_id
/
select * from newest_owner where car_id = 25;
ROW_ID CL_ROWID PK_COL CAR_ID PERSON_ID REGISTRAT
------------------ ------------------ ---------- ---------- ---------- ---------
AAAUreAAMAAD/IxABS AAAUriAAMAAD+TNACE 644158 25 44158 09-APR-22
insert into car_owners values (1000001, 25,-1,sysdate);
commit;
select * from newest_owner where car_id = 25;
ROW_ID CL_ROWID PK_COL CAR_ID PERSON_ID REGISTRAT
------------------ ------------------ ---------- ---------- ---------- ---------
AAAUreAAMAAD/pLAB1 AAAUriAAMAAD+TNACE 1000001 25 -1 22-APR-22
explain plan for
select c.rowid row_id,cl.rowid cl_rowid, c.pk_col, c.car_id, c.person_id, c.registration_date
from car_owners c
join latest_registration cl
on c.registration_date = cl.max_registration_date
and c.car_id = cl.car_id
/
select * from dbms_xplan.display();
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 999 | 41958 | 3 (0)| 00:00:01 |
| 1 | MAT_VIEW REWRITE ACCESS FULL| NEWEST_OWNER | 999 | 41958 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Have a read of the docs: https://docs.oracle.com/en/database/oracle/oracle-database/21/dwhsg/basic-materialized-views.html#GUID-E087FDD0-B08C-4878-BBA9-DE56A705835E https://docs.oracle.com/en/database/oracle/oracle-database/21/dwhsg/basic-materialized-views.html#GUID-179C8C8A-585B-49E6-8970-09396DB53DE3 there are some restrictions that can slow down your refreshes (eg deletes).

Table scan on internal tables

Assume I have 2 tables - TABLE-1 & TABLE-2 and each of the table has 1 million rows with 10 columns and index on col1..
Now I build a internal table on this 2 tables ( 1 + 1 = 2 million) rows,
select * from
(select col1, col2,....col10 from table-1
union all
select col1, col2,....col10 from table-2) x
Questions,
how will the internal table will be treated in Oracle since its a internal table..
1. Will the internal table will be treated as a table with index on col1?
2. Will this be captured in the Explain plan?
Yes and yes.
Oracle will effectively treat this inline view as a table. It can use predicate pushing to apply a filter on the inline view to the base tables, and potentially use an index. The explain plan will show this.
Tables, indexes, sample data, and statistics
create table table1(col1 number, col2 number, col3 number, col4 number);
create table table2(col1 number, col2 number, col3 number, col4 number);
create index table1_idx on table1(col1);
create index table2_idx on table2(col1);
insert into table1 select level, level, level, level
from dual connect by level <= 100000;
insert into table2 select level, level, level, level
from dual connect by level <= 100000;
commit;
begin
dbms_stats.gather_table_stats(user, 'TABLE1');
dbms_stats.gather_table_stats(user, 'TABLE2');
end;
/
Explain plan showing predicate pushing and index access
explain plan for
select * from
(
select col1, col2, col3, col4 from table1
union all
select col1, col2, col3, col4 from table2
)
where col1 = 1;
select * from table(dbms_xplan.display);
Plan hash value: 400235428
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 40 | 2 (0)| 00:00:01 |
| 1 | VIEW | | 2 | 40 | 2 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE1 | 1 | 20 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | TABLE1_IDX | 1 | | 1 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE2 | 1 | 20 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | TABLE2_IDX | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("COL1"=1)
6 - access("COL1"=1)
Notice how the predicates happen before the VIEW, and both indexes are used. By default everything should work as well as can be expected.
Notes
This type of query structure is called an inline view. Although a physical table is not built, the phrase "internal tables" is a good way of thinking about how the query logically works. Ideally, an inline view would work exactly like a pre-built table with the same data. In reality there are some cases where things don't quit work that way. But in general you are definitely on the right path - build a large query by assembling small inline views, and assume that Oracle will optimize it correctly.
for your particular query no any index will be used, but I suppose you do some filtering, ie where x.col1 = ###, I'm not sure that oracle will be able to use table-1/table-2 indexes to filter, so I suggest you to put where statements inside "union query"

Oracle primary key vs. index NOT IN performance

I have the following use case:
A table stores the changed as well as the original data from a person. My query is designed to get only one row for each person: The changed data if there is some, else the original data.
I populated the table with 100k rows of data and 2k of changed data. When using a primary key on my table the query runs in less than a half second. If I put an index on the table instead of a primary key the query runs really slow. So I'll use the primary key, no doubt about that.
My question is: Why is the PK approach so much faster than the one with an index?
Code here:
drop table up_data cascade constraints purge;
/
create table up_data(
pk integer,
hp_nr integer,
up_nr integer,
ps_flag varchar2(1),
ps_name varchar2(100)
-- comment this out and uncomment the index below.
, constraint pk_up_data primary key (pk,up_nr)
);
/
-- insert some data
insert into up_data
select rownum, 1, 0, 'A', 'tester_' || to_char(rownum)
from dual
connect by rownum < 100000;
/
-- insert some changed data
-- change ps_flag = 'B' and mark it with a change number in up_nr
insert into up_data
select rownum, 1, 1, 'B', 'tester_' || to_char(rownum)
from dual
connect by rownum < 2000;
/
-- alternative(?) to the primary key
-- CREATE INDEX idx_up_data ON up_data(pk, up_nr);
/
The select statement looks like this:
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
The statement might be target of optimization but for the moment it will stay like this.
When you create a primary key constraint, Oracle also creates an index to support this at the same time. A primary key index has a couple of important differences over a basic index, namely:
All the values in this are guaranteed to be unique
There's no nulls in the table rows (of the columns forming the PK)
These reasons are the key to the performance differences you see. Using your setup, I get the following query plans:
--fast version with PK
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
-----------------------------------------------------
| Id | Operation | Name | Rows |
-----------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| PK_UP_DATA | 103K|
| 4 | INDEX UNIQUE SCAN | PK_UP_DATA | 1 |
-----------------------------------------------------
alter table up_data drop constraint pk_up_data;
CREATE INDEX idx_up_data ON up_data(pk, up_nr);
/
--slow version with normal index
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| IDX_UP_DATA | 103K|
| 4 | INDEX FAST FULL SCAN| IDX_UP_DATA | 1870 |
------------------------------------------------------
The big difference is that the fast version employs a INDEX UNIQUE SCAN, rather than a INDEX FAST FULL SCAN in the second access of the table data.
From the Oracle docs (emphasis mine):
In contrast to an index range scan, an index unique scan must have
either 0 or 1 rowid associated with an index key. The database
performs a unique scan when a predicate references all of the columns
in a UNIQUE index key using an equality operator. An index unique scan
stops processing as soon as it finds the first record because no
second record is possible.
This optimization to stop processing proves to be a significant factor in this example. The fast version of your query:
Full scans ~103,000 index entries
For each one of these finds one matching row in the PK index and stop processing the second index further
The slow version:
Full scans ~103,000 index entries
For each one of these performs another scan of the 103,000 rows to find if there's any matches.
So to compare the work done:
With the PK, we have one fast full scan, then 103,000 lookups of one index value
With normal index, we have one fast full scan then 103,000 scans of 103,000 index entries - several orders of magnitude more work!
In this example, both the uniqueness of the primary key and the not null-ness of the index values are necessary to get the performance benefit:
-- create index as unique - we still get two fast full scans
drop index index idx_up_data;
create unique index idx_up_data ON up_data(pk, up_nr);
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| IDX_UP_DATA | 103K|
| 4 | INDEX FAST FULL SCAN| IDX_UP_DATA | 1870 |
------------------------------------------------------
-- now the columns are not null, we see the index unique scan
alter table up_data modify (pk not null, up_nr not null);
explain plan for
select count(*)
from
(
select *
from up_data u1
where up_nr = 1
or (up_nr = 0
and pk not in (select pk from up_data where up_nr = 1)
)
) u
/
select * from table(dbms_xplan.display(NULL, NULL,'BASIC +ROWS'));
------------------------------------------------------
| Id | Operation | Name | Rows |
------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 |
| 1 | SORT AGGREGATE | | 1 |
| 2 | FILTER | | |
| 3 | INDEX FAST FULL SCAN| IDX_UP_DATA | 103K|
| 4 | INDEX UNIQUE SCAN | IDX_UP_DATA | 1 |
------------------------------------------------------

Oracle Index with multiple Columns querying on single column

In a table in our Oracle installation we have a table with an index on two of the columns (X and Y). If I do a query on the table with a where clause only touching column X, will Oracle be able to use the index?
For example:
Table Y:
Col_A,
Col_B,
Col_C,
Index exists on (Col_A, Col_B)
SELECT * FROM Table_Y WHERE Col_A = 'STACKOVERFLOW';
Will the index be used, or will a table scan be done?
It depends.
You could check it by letting Oracle explain the execution plan:
EXPLAIN PLAN FOR
SELECT * FROM Table_Y WHERE Col_A = 'STACKOVERFLOW';
and then
select * from table(dbms_xplan.display);
So, for example with
create table table_y (
col_a varchar2(30),
col_b varchar2(30),
col_c varchar2(30)
);
create unique index table_y_ix on table_y (col_a, col_b);
and then a
explain plan for
select * from table_y
where col_a = 'STACKOVERFLOW';
select * from table(dbms_xplan.display);
The plan (on my installation) looks like:
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 51 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TABLE_Y | 1 | 51 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TABLE_Y_IX | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COL_A"='STACKOVERFLOW')
ID 2 shows you, that the index TABLE_Y_IX is indeed used for an index range scan.
If on another installation Oracle chooses to use the index is dependend on many things. It's Oracle's query optimizer that makes this decision.
Update If you feel you're be better off (performance wise, that is) if Oracle used the index, you might want to try the + index_asc(...) (see index hint)
So in your case that would be something like
SELECT /*+ index_asc(TABLE_Y TABLE_Y_IX) */ *
FROM Table_Y
WHERE Col_A = 'STACKOVERFLOW';
Additionally, I would ensure that you have gathered statistics on the table and its columns. You can check the date of the last gathering of statistics with a
select last_analyzed from dba_tables where table_name = 'TABLE_Y';
and
select column_name, last_analyzed from dba_tab_columns where table_name = 'TABLE_Y';
If there are no statistics or if they're stale, make yourself familiar with the dbms_stats package to gather such statistics.
These statistics are the data that the query optimizer relies on heavily to make its decisions.

How can I get a COUNT(col) ... GROUP BY to use an index?

I've got a table (col1, col2, ...) with an index on (col1, col2, ...). The table has got millions of rows in it, and I want to run a query:
SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1
Unfortunately, this is resulting in a full table scan of the table, which takes upwards of a minute. Is there any way of getting oracle to use the index on the columns to return the results much faster?
EDIT:
more specifically, I'm running the following query:
SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner
and there is an index on SYS.OBJ$ (SYS.I_OBJ2) which indexes the owner# and name columns; I believe I should be able to use this index in the query, rather than a full table scan of SYS.OBJ$
I have had the chance to play around with this, and my previous comments regarding the NOT IN are a red herring in this case. The key thing is the presence of NULLs, or rather whether the indexed columns have NOT NULL constraints enforced.
This is going to depend on the version of the database you're using, because the optimizer gets smarter with each release. I'm using 11gR1 and the optimizer used the index in all cases except one: when both columns were null and I didn't include the NOT IN clause:
SQL> desc big_table
Name Null? Type
----------------------------------- ------ -------------------
ID NUMBER
COL1 NUMBER
COL2 VARCHAR2(30 CHAR)
COL3 DATE
COL4 NUMBER
Without the NOT IN clause...
SQL> explain plan for
2 select col4, count(col1) from big_table
3 group by col4
4 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 31964 | 280K| | 7574 (2)| 00:01:31 |
| 1 | HASH GROUP BY | | 31964 | 280K| 45M| 7574 (2)| 00:01:31 |
| 2 | TABLE ACCESS FULL| BIG_TABLE | 2340K| 20M| | 4284 (1)| 00:00:52 |
----------------------------------------------------------------------------------------
9 rows selected.
SQL>
When I dobbed the NOT IN clause back in, the optimizer opted to use the index. Weird.
SQL> explain plan for
2 select col4, count(col1) from big_table
3 where col1 not in (12, 19)
4 group by col4
5 /
Explained.
SQL> select * from table(dbms_xplan.display)
2 /
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 31964 | 280K| | 5057 (3)| 00:01:01 |
| 1 | HASH GROUP BY | | 31964 | 280K| 45M| 5057 (3)| 00:01:01 |
|* 2 | INDEX FAST FULL SCAN| BIG_I2 | 2340K| 20M| | 1767 (2)| 00:00:22 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
2 - filter("COL1"<>12 AND "COL1"<>19)
14 rows selected.
SQL>
Just to repeat, in all other cases, as long as one of the indexed columns was declared not nill, the index was used to satisfy the query. This may not be true on earlier versions of Oracle, but it probably points the way forward.
you could use a hint http://download.oracle.com/docs/cd/B10501_01/server.920/a96533/hintsref.htm ,
but remember that using an index might not always result in faster execution.
(Just in case, are you sure it's doing a table scan and not an index scan?)
Try using COUNT(*) instead of COUNT(col2) (assuming this is appropriate for you problem, of course). Also, maybe try an index with just col1.
You are querying against oracle's fixed tables, since you've not stated which db vesion this is, I'll assume a recent one. Have the fixed tables been analyzed and have updated statistics? Have you tried your query using the rule base optimizer by the use of the /*+ rule */ hint. Often I've seen that queries against oracle's own fixed tables perform better when the rule base optimizer is used.

Resources