I have 3 tables in an Oracle 11g database. I don't have access to a trace file or explain plan anymore. I join the 3 tables on the date field like:
select * from a,b,c where a.date = b.date and b.date = c.date
and that takes forever.
When I run
select * from a,b,c where a.date = b.date and b.date = c.date and a.date = c.date
it's fast. But should that make a difference?
I'm not sure, but it looks like a transitive dependency; that is to say, if a.date = b.date and b.date = c.date, then a.date = c.date. You can rewrite your query like this:
select a.*
from a
join b on a.date = b.date
join c on a.date = c.date;
I would also put an index on the date column of all three tables, since that's the column you are joining on.
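For example, a minimal sketch, assuming the join column is actually named dt (date itself is a reserved word and can't be an unquoted column name) and with made-up index names:
create index a_dt_idx on a (dt);
create index b_dt_idx on b (dt);
create index c_dt_idx on c (dt);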
Apparently the database does not rewrite queries to infer that A = B, B = C ==> A = C, so it's stuck with what it's given.
Consider the following:
create table a (dt date);
create table b (dt date);
create table c (dt date);
Now fill the tables so that a is the smallest (5 rows), b is the biggest (100 rows), and c is in the middle (50 rows). Also make it so that not all rows in b and c join to a, just to make things a bit more interesting.
insert into a
select to_date('2015-01-01', 'yyyy-mm-dd') + rownum - 1
from dual
connect by level <= 5
;
insert into b
select to_date('2015-01-01', 'yyyy-mm-dd') + mod(rownum, 10)
from dual
connect by level <= 100
;
insert into c
select to_date('2015-01-01', 'yyyy-mm-dd') + mod(rownum, 10)
from dual
connect by level <= 50
;
I'm going to bypass statistics for now and leave it entirely up to the database to figure out a plan.
Take 1: without the join from a to c:
explain plan for
select *
from a
, b
, c
where a.dt = b.dt
and b.dt = c.dt
;
and here's the plan:
select *
from table(dbms_xplan.display())
;
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 6750 | 9 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 250 | 6750 | 9 (0)| 00:00:01 |
|* 2 | HASH JOIN | | 50 | 900 | 6 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| A | 5 | 45 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| B | 100 | 900 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | C | 50 | 450 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("B"."DT"="C"."DT")
2 - access("A"."DT"="B"."DT")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
First off, since there were no statistics on the tables, Oracle chose to sample the data first so it wasn't going in blind. In this case, table a joins to b first, then the result of that joins to c.
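(Outside of a demo like this you would normally give the optimizer real statistics instead of relying on dynamic sampling; a minimal sketch:)
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'A');
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'B');
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'C');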
Take 2: introduce the a.dt = c.dt condition:
explain plan for
select *
from a
, b
, c
where a.dt = b.dt
and b.dt = c.dt
and a.dt = c.dt
;
select *
from table(dbms_xplan.display())
;
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 675 | 9 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 25 | 675 | 9 (0)| 00:00:01 |
|* 2 | HASH JOIN | | 25 | 450 | 6 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| A | 5 | 45 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| C | 50 | 450 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | B | 100 | 900 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."DT"="B"."DT" AND "B"."DT"="C"."DT")
2 - access("A"."DT"="C"."DT")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
And there you go. The order of the joins has switched now that Oracle has been given the extra join path. (FYI, this is the same plan if using just a.dt = b.dt and a.dt = c.dt.)
BUT, notice anything? The estimates are not right anymore. It's guessing 25 rows in the end, not 250. So, the extra condition is actually causing some confusion.
Without the b.dt = c.dt predicate, though, you get the same join path but different estimates (and the same end result as the first query).
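For reference, the query behind this third plan (per the predicate section below) is just Take 2 with the b-to-c condition dropped:
explain plan for
select *
from a
, b
, c
where a.dt = b.dt
and a.dt = c.dt
;
and here is its plan: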
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 6750 | 9 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 250 | 6750 | 9 (0)| 00:00:01 |
|* 2 | HASH JOIN | | 25 | 450 | 6 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| A | 5 | 45 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| C | 50 | 450 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | B | 100 | 900 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."DT"="B"."DT")
2 - access("A"."DT"="C"."DT")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Long story a little longer: since the database isn't going to assume any join paths for you, adding one to your query gives the database more options and as such can change its plan... and a change in plan can certainly affect how fast the results are returned.
This is your query:
select * from a,b,c where a.date = b.date and b.date = c.date and a.date = c.date
In my view, you could instead write it as:
SELECT * FROM a
JOIN B USING(date)
JOIN C USING(date);
Related
In the example below, the Oracle optimizer's row estimate is off by two orders of magnitude. How do I improve the estimated rows?
Table A has rows with numbers 1 through 1,000 for each of the 10 letters A through J.
Table C has 100 copies of table A.
So, table A has a cardinality of 10K and table C has a cardinality of 1M.
A given single-valued predicate on the number in table A will yield 1/1000 of the rows in table A (same for table C).
A given single-valued predicate on the letter in table A will yield 1/10 of the rows in table A (same for table C).
Setup script.
drop table C;
drop table A;
create table A
( num NUMBER
, val VARCHAR2(3 byte)
, pad CHAR(40 byte)
)
;
insert /*+ append enable_parallel_dml parallel (auto) */
into A (num, val, pad)
select mod(level-1, 1000) +1
, chr(mod(ceil(level/1000) - 1, 10) + ascii('A'))
, ' '
from dual
connect by level <= 10*1000
;
create table C
( id NUMBER
, num NUMBER
, val VARCHAR2(3 byte)
, pad CHAR(40 byte)
)
;
insert /*+ append enable_parallel_dml parallel (auto) */
into C (id, num, val, pad)
with
"D1" as
( select /*+ materialize */ null from dual connect by level <= 100 --320
)
, "D" as
( select /*+ materialize */
level rn
, mod(level-1, 1000) + 1 num
, chr(mod(ceil(level/1000) - 1, 10) + ascii('A')) val
, ' ' pad
from dual
connect by level <= 10*1000
order by 1 offset 0 rows
)
select rownum id
, num num
, val val
, pad pad
from "D1", "D"
;
commit;
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'A', cascade => true);
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'C', cascade => true);
Consider the explain plan to the following query.
select *
from A
join C
on A.num = C.num
and A.val = C.val
where A.num = 1
and A.val = 'A'
;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 9900 | 2209 (1)| 00:00:01 |
|* 1 | HASH JOIN | | 100 | 9900 | 2209 (1)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| A | 1 | 47 | 23 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| C | 100 | 5200 | 2185 (1)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."NUM"="C"."NUM" AND "A"."VAL"="C"."VAL")
2 - filter("A"."NUM"=1 AND "A"."VAL"='A')
3 - filter("C"."NUM"=1 AND "C"."VAL"='A')
The row cardinality of each step makes sense to me.
ID=2 --> (1/1,000) * (1/10) * 10,000 = 1
ID=3 --> (1/1,000) * (1/10) * 1,000,000 = 100
ID=1 --> 100 is correct. Predicates in ID=2 and ID=3 are the same, every row from ID=2 will have one and only one match in the row source from ID=3.
Now consider the explain plan to the slightly modified query below.
select *
from A
join C
on A.num = C.num
and A.val = C.val
where A.num in(1,2)
and A.val = 'A'
;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 198 | 2209 (1)| 00:00:01 |
|* 1 | HASH JOIN | | 2 | 198 | 2209 (1)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| A | 2 | 94 | 23 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| C | 200 | 10400 | 2185 (1)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."NUM"="C"."NUM" AND "A"."VAL"="C"."VAL")
2 - filter("A"."VAL"='A' AND ("A"."NUM"=1 OR "A"."NUM"=2))
3 - filter("C"."VAL"='A' AND ("C"."NUM"=1 OR "C"."NUM"=2))
The row cardinality of steps ID=2 and ID=3 makes sense to me, but now ID=1 is off by two orders of magnitude.
ID=2 --> (1/1,000)(1/10) * 10,000 = 1
ID=3 --> (1/1,000)(1/10) * 1,000,000 = 100
ID=1 --> The optimizer's estimate is two orders of magnitude different from the actual.
Adding unique and foreign constraints and extended statistics did not improve the estimated row counts.
create unique index IU_A on A (num, val);
alter table A add constraint UK_A unique (num, val) rely using index IU_A enable validate;
alter table C add constraint R_C foreign key (num, val) references A (num, val) rely enable validate;
create index IR_C on C (num, val);
select dbms_stats.create_extended_stats(null,'A','(num, val)') from dual;
select dbms_stats.create_extended_stats(null,'C','(num, val)') from dual;
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'A', cascade => true);
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'C', cascade => true);
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 198 | 10 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 2 | 198 | 10 (0)| 00:00:01 |
| 3 | INLIST ITERATOR | | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| A | 2 | 94 | 5 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | IU_A | 2 | | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IR_C | 1 | | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | C | 1 | 52 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access(("A"."NUM"=1 OR "A"."NUM"=2) AND "A"."VAL"='A')
6 - access("A"."NUM"="C"."NUM" AND "C"."VAL"='A')
filter("C"."NUM"=1 OR "C"."NUM"=2)
What do I need to do to make the estimated rows better match reality?
Using Oracle Enterprise Edition 19c.
Thanks in advance.
Edit
After ensuring the most recent optimizer_features_enable was used and modifying one of the predicates, we still have an explain plan whose estimated row count is short by two orders of magnitude.
ID=6 ought to have an estimated row count of 100. It seems the optimizer is applying the predicate selectivity twice: once for the access predicate and again for the filter predicate.
select /*+ optimizer_features_enable('19.1.0') */
*
from A
join C
on A.num = C.num
and A.val = C.val
where A.num in(1,2)
and A.val in('A','B')
;
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 396 | 16 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 4 | 396 | 16 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 4 | 396 | 16 (0)| 00:00:01 |
| 3 | INLIST ITERATOR | | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 4 | 188 | 7 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | IU_A | 4 | | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IR_C | 1 | | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | C | 1 | 52 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("A"."NUM"=1 or "A"."NUM"=2)
filter("A"."VAL"='A' or "A"."VAL"='B')
6 - access("A"."NUM"="C"."NUM" and "A"."VAL"="C"."VAL")
filter(("C"."NUM"=1 or "C"."NUM"=2) and ("C"."VAL"='A' or "C"."VAL"='B'))
Why does Oracle still apply a filter predicate on an index even after the access predicate for that same index guarantees the filter predicate is always true?
drop table index_filter_child
;
drop table index_filter_parent
;
create table index_filter_parent
as
select level id, chr(mod(level - 1, 26) + ascii('A')) code from dual connect by level <= 26
;
create table index_filter_child
as
with
"C" as (select chr(mod(level - 1, 26) + ascii('A')) code from dual connect by level <= 26)
select rownum id, C1.code from C C1, C C2
;
exec dbms_stats.gather_table_stats('USER','INDEX_FILTER_PARENT')
;
exec dbms_stats.gather_table_stats('USER','INDEX_FILTER_CHILD')
;
create index ix_index_filter_parent on index_filter_parent(code)
;
create index ix_index_filter_child on index_filter_child(code)
;
select P.*
from index_filter_parent "P"
join index_filter_child "C"
on C.code = P.code
where P.code in('A','Z') --same result if we predicate instead on C.code in('A','Z')
;
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 35 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 5 | 35 | 4 (0)| 00:00:01 |
| 2 | INLIST ITERATOR | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID| INDEX_FILTER_PARENT | 2 | 10 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_INDEX_FILTER_PARENT | 2 | | 1 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | IX_INDEX_FILTER_CHILD | 2 | 4 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("P"."CODE"='A' or "P"."CODE"='Z')
5 - access("C"."CODE"="P"."CODE")
filter("C"."CODE"='A' or "C"."CODE"='Z') <========== why is this needed?
Why is the filter predicate in 5 needed in light of the access("C"."CODE"="P"."CODE") guaranteeing C.code is 'A' or 'Z'?
Thank you in advance.
Oracle 12.1 Enterprise Edition.
This is a result of the "transitive closure" transformation; you can read more about it here:
Transitivity and Transitive Closure (Doc ID 68979.1)
Jonathan Lewis - Cartesian Merge Join
Jonathan Lewis - Transitive Closure (or, even better, in his book "Cost Based Oracle Fundamentals")
If you get a CBO trace (alter session set events '10053 trace name context forever, level 1' or alter session set events 'trace[SQL_Optimizer.*]'), you will see that the transformation happens before join methods and access paths are chosen. It allows the CBO to analyze more access paths and choose the best available plan. Moreover, in the case of adaptive plans, it allows Oracle to change the join method on the fly.
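A minimal way to capture such a trace for the query above (a sketch; the tracefile_identifier and the v$diag_info lookup are just conveniences for finding the file):
alter session set tracefile_identifier = 'cbo_10053';
alter session set events '10053 trace name context forever, level 1';
explain plan for
select P.*
from index_filter_parent "P"
join index_filter_child "C"
on C.code = P.code
where P.code in('A','Z');
alter session set events '10053 trace name context off';
select value from v$diag_info where name = 'Default Trace File';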
For example, you can get a plan like this:
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 52 | 364 | 4 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 52 | 364 | 4 (0)| 00:00:01 |
| 2 | INLIST ITERATOR | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| INDEX_FILTER_PARENT | 2 | 10 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_INDEX_FILTER_PARENT | 2 | | 1 (0)| 00:00:01 |
| 5 | INLIST ITERATOR | | | | | |
|* 6 | INDEX RANGE SCAN | IX_INDEX_FILTER_CHILD | 52 | 104 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("C"."CODE"="P"."CODE")
4 - access("P"."CODE"='A' OR "P"."CODE"='Z')
6 - access("C"."CODE"='A' OR "C"."CODE"='Z')
In fact, you can disable it using the event 10155: CBO disable generation of transitive OR-chains.
Your example:
alter session set events '10155';
explain plan for
select P.*
from index_filter_parent "P"
join index_filter_child "C"
on C.code = P.code
where P.code in('A','Z');
Results:
SQL> alter session set events '10155';
Session altered.
SQL> explain plan for
2 select P.*
3 from index_filter_parent "P"
4 join index_filter_child "C"
5 on C.code = P.code
6 where P.code in('A','Z') ;
Explained.
SQL> #xplan typical
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
Plan hash value: 2543178509
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 52 | 364 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 52 | 364 | 4 (0)| 00:00:01 |
| 2 | INLIST ITERATOR | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| INDEX_FILTER_PARENT | 2 | 10 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_INDEX_FILTER_PARENT | 2 | | 1 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | IX_INDEX_FILTER_CHILD | 26 | 52 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("P"."CODE"='A' OR "P"."CODE"='Z')
5 - access("C"."CODE"="P"."CODE")
Note
-----
- this is an adaptive plan
22 rows selected.
As you can see, that predicate has disappeared.
PS. Other events for transitive predicates:
ORA-10155: CBO disable generation of transitive OR-chains
ORA-10171: CBO disable transitive join predicates
ORA-10179: CBO turn off transitive predicate replacement
ORA-10195: CBO don't use check constraints for transitive predicates
How to optimize an update query:
UPDATE frst_usage vfm1 SET
(vfm1.account_dn,
vfm1.usage_date,
vfm1.country,
vfm1.feature_name,
vfm1.hu_type,
vfm1.make,
vfm1.region,
vfm1.service_hits,
vfm1.maint_last_ts,
vfm1.accountdn_hashcode) = (
SELECT
(SELECT vst.account_dn FROM services_track vst WHERE vst.accountdn_hashcode = vrd1.account_dn_hashcode AND rownum = 1),
min(usage_date),
country,
feature_name,
hu_type,
make,
region,
service_hits,
SYSDATE,
account_dn_hashcode
FROM raw_data vrd1
WHERE vrd1.vin_hashcode = vfm1.vin_hashcode
AND vrd1.usage_date IS NOT NULL AND rownum = 1
GROUP BY account_dn, country, feature_name, hu_type, make, region, service_hits, vfm1.maint_last_ts, account_dn_hashcode
);
The tables have indexes on all the columns used in the WHERE conditions.
Still, the execution takes more than 4 hours. Below is the explain plan.
From the execution plan I can see that the SELECT part is fine but the UPDATE is consuming most of the time and resources. Is there a way I could optimize this?
I think correlated subqueries may be an issue:
WHERE vrd1.vin_hashcode = vfm1.vin_hashcode
You should try the MERGE statement; it can have a dramatic impact on performance:
http://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9016.htm
Below is an example similar to yours: 10k sample rows, all columns indexed, and statistics gathered.
Update (16s)
SQL> update x1 set (v1, v2, v3, v4) =
2 (
3 select v1, v2, v3, min(v4)
4 from x2
5 where x1.nr = x2.nr
6 group by v1,v2,v3
7 );
9999 rows updated.
Elapsed: 00:00:16.56
Execution Plan
----------------------------------------------------------
Plan hash value: 3497322513
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 9999 | 859K| 1679K (5)| 05:35:59 |
| 1 | UPDATE | X1 | | | | |
| 2 | TABLE ACCESS FULL | X1 | 9999 | 859K| 40 (0)| 00:00:01 |
| 3 | SORT GROUP BY | | 1 | 88 | 41 (3)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| X2 | 1 | 88 | 40 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("X2"."NR"=:B1)
Merge (1.5s)
SQL> merge into x1 using (
2 select nr, v1, v2, v3, min(v4) v4
3 from x2
4 group by nr, v1,v2,v3
5 ) xx2
6 on (x1.nr = xx2.nr)
7 when matched then update set
8 x1.v1 = xx2.v1, x1.v2 = xx2.v2, x1.v3 = xx2.v3, x1.v4 = xx2.v4;
9999 rows merged.
Elapsed: 00:00:01.25
Execution Plan
----------------------------------------------------------
Plan hash value: 1113810112
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 9999 | 58M| | 285 (1)| 00:00:04 |
| 1 | MERGE | X1 | | | | | |
| 2 | VIEW | | | | | | |
|* 3 | HASH JOIN | | 9999 | 58M| | 285 (1)| 00:00:04 |
| 4 | TABLE ACCESS FULL | X1 | 9999 | 859K| | 40 (0)| 00:00:01 |
| 5 | VIEW | | 9999 | 57M| | 244 (1)| 00:00:03 |
| 6 | SORT GROUP BY | | 9999 | 859K| 1040K| 244 (1)| 00:00:03 |
| 7 | TABLE ACCESS FULL| X2 | 9999 | 859K| | 40 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("X1"."NR"="XX2"."NR")
Indexes on the target table reduce performance. Disable the indexes before updating the table and rebuild them once the update has completed.
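A sketch of that approach (the index name is made up; note this only works for indexes that do not enforce a unique or primary key constraint):
alter index frst_usage_ix1 unusable;
alter session set skip_unusable_indexes = true; -- the default; DML then ignores the unusable index
-- run the large UPDATE / MERGE here
alter index frst_usage_ix1 rebuild;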
I was able to solve this using the query below, and now the explain plan looks good.
merge INTO FRST_USAGE vfu USING
(SELECT tmp1.*
FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY tmp.vin_hash ORDER BY tmp.usage_date) AS rn,
tmp.*
FROM
(SELECT vrd.VIN_HASHCODE AS vin_hash,
vrd.ACCOUNT_DN_HASHCODE AS actdn_hash,
vst.ACCOUNT_DN AS actdn,
vrd.FEATURE_NAME AS feature,
vrd.MAKE AS make,
vrd.COUNTRY AS country,
vrd.HU_TYPE AS hu,
vrd.REGION AS region,
vrd.SERVICE_HITS AS hits,
MIN(vrd.USAGE_DATE) AS usage_date,
sysdate AS maintlastTs
FROM RAW_DATA vrd,
SERVICES_TRACK vst
WHERE vrd.ACCOUNT_DN_HASHCODE=vst.ACCOUNTDN_HASHCODE
GROUP BY vrd.VIN_HASHCODE,
vrd.ACCOUNT_DN_HASHCODE,
vst.ACCOUNT_DN,
vrd.FEATURE_NAME,
vrd.MAKE,
vrd.COUNTRY,
vrd.HU_TYPE,
vrd.REGION,
vrd.SERVICE_HITS,
sysdate
ORDER BY vrd.VIN_HASHCODE,
MIN(vrd.USAGE_DATE)
) tmp
)tmp1
WHERE tmp1.rn =1
) tmp2 ON (vfu.VIN_HASHCODE = tmp2.vin_hash)
WHEN matched THEN
UPDATE
SET vfu.ACCOUNTDN_HASHCODE=tmp2.actdn_hash,
vfu.account_dn =tmp2.actdn,
vfu.FEATURE_NAME =tmp2.feature,
vfu.MAKE =tmp2.make,
vfu.COUNTRY =tmp2.country,
vfu.HU_TYPE =tmp2.hu,
vfu.REGION =tmp2.region,
vfu.SERVICE_HITS =tmp2.hits,
vfu.usage_date =tmp2.usage_date,
vfu.MAINT_LAST_TS =tmp2.maintlastTs;
Below is the explain plan.
Suggestions are welcome if there are any further optimizations I can make here.
What is the best practice for converting the following SQL statement, which uses subquery factoring (a WITH data AS clause), so it can be used in a database view?
AFAIK the WITH data AS clause is not supported in database views (edit: Oracle does support common table expressions), but in my case the subquery factoring offers a performance advantage. If I create a database view using a common table expression, then this advantage is lost.
Please have a look at my example:
Description of query
a_table
Millions of entries; the select statement picks out a few thousand of them.
anchor_table
For each entry in a_table there is a corresponding entry in anchor_table. At runtime exactly one row is determined as the anchor via this table. See the example below.
horizon_table
For each selection exactly one entry is determined at runtime (all entries of a selection of a_table have the same horizon_id)
Please note: this is a heavily simplified SQL statement that works fine so far.
In reality more than 20 tables are joined together to produce the results of data.
The where clause is much more complex.
Further columns of horizon_table and anchor_table are required to build the WHERE condition and the result list in the subquery, i.e. moving these tables to the main query is not an option.
with data as (
select
a_table.id,
a_table.descr,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(
order by a_table.a_position_field) as position
from a_table
join anchor_table on (anchor_table.id = a_table.anchor_id)
join horizon_table on (horizon_table.id = a_table.horizon_id)
where a_table.a_value between 1 and 10000
)
select *
from data d
where d.position between (
select d1.position - d.offset
from data d1
where d1.anchor = 1)
and (
select d2.position + d.offset
from data d2
where d2.anchor = 1)
example of with data as select:
id descr offset anchor position
1 bla 3 0 1
2 blab 3 0 2
5 dfkdj 3 0 3
4 dld 3 0 4
6 oeroe 3 1 5
3 blab 3 0 6
9 dfkdj 3 0 7
14 dld 3 0 8
54 oeroe 3 0 9
...
result of select * from data
id descr offset anchor position
2 blab 3 0 2
5 dfkdj 3 0 3
4 dld 3 0 4
6 oeroe 3 1 5
3 blab 3 0 6
9 dfkdj 3 0 7
14 dld 3 0 8
I.e. the result is the anchor row and the three rows above and below it.
How can I achieve the same within a database view?
My attempt failed, as I expected, due to performance issues:
Create a view data from the WITH data AS select above
Use this view as above:
select *
from data d
where d.position between (
select d1.position - d.offset
from data d1
where d1.anchor = 1)
and (
select d2.position + d.offset
from data d2
where d2.anchor = 1)
Thank you for any advice :-)
Amendment
If I create a view as recommended in the first comment, then I get the same performance issue. Oracle does not use the subquery to restrict the results.
Here are the execution plans of my production queries (please click on the images):
a) SQL
b) View
Here are the execution plans of my test cases
-- Create test data (~1,000,000 entries)
insert into a_table
(id, descr, a_position_field, anchor_id, horizon_id, a_value)
select level, 'data' || level, mod(level, 10), level, 1, level
from dual
connect by level <= 999999;
insert into anchor_table
(id, a_date)
select level, trunc(sysdate) - 500000 + level
from dual
connect by level <= 999999;
insert into horizon_table (id, offset) values (1, 50);
commit;
-- Create view
create or replace view testdata_vw as
with data as
(select a_table.id,
a_table.descr,
a_table.a_value,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(order by a_table.a_position_field) as position
from a_table
join anchor_table
on (anchor_table.id = a_table.anchor_id)
join horizon_table
on (horizon_table.id = a_table.horizon_id))
select *
from data d
where d.position between
(select d1.position - d.offset from data d1 where d1.anchor = 1) and
(select d2.position + d.offset from data d2 where d2.anchor = 1);
-- Explain plan of subquery factoring select statement
explain plan for
with data as
(select a_table.id,
a_table.descr,
a_value,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(order by a_table.a_position_field) as position
from a_table
join anchor_table
on (anchor_table.id = a_table.anchor_id)
join horizon_table
on (horizon_table.id = a_table.horizon_id)
where a_table.a_value between 500000 - 500 and 500000 + 500)
select *
from data d
where d.position between
(select d1.position - d.offset from data d1 where d1.anchor = 1) and
(select d2.position + d.offset from data d2 where d2.anchor = 1);
select plan_table_output
from table(dbms_xplan.display('plan_table', null, null));
/*
Note: Size of SYS_TEMP_0FD9D6628_284C5768 ~ 1000 rows
Plan hash value: 1145408420
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 62 | 1791 (2)| 00:00:31 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6628_284C5768 | | | | |
| 3 | WINDOW SORT | | 57 | 6840 | 1785 (2)| 00:00:31 |
|* 4 | HASH JOIN | | 57 | 6840 | 1784 (2)| 00:00:31 |
|* 5 | TABLE ACCESS FULL | A_TABLE | 57 | 4104 | 1193 (2)| 00:00:21 |
| 6 | MERGE JOIN CARTESIAN | | 1189K| 54M| 586 (2)| 00:00:10 |
| 7 | TABLE ACCESS FULL | HORIZON_TABLE | 1 | 26 | 3 (0)| 00:00:01 |
| 8 | BUFFER SORT | | 1189K| 24M| 583 (2)| 00:00:10 |
| 9 | TABLE ACCESS FULL | ANCHOR_TABLE | 1189K| 24M| 583 (2)| 00:00:10 |
|* 10 | FILTER | | | | | |
| 11 | VIEW | | 57 | 3534 | 2 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_284C5768 | 57 | 4104 | 2 (0)| 00:00:01 |
|* 13 | VIEW | | 57 | 912 | 2 (0)| 00:00:01 |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_284C5768 | 57 | 4104 | 2 (0)| 00:00:01 |
|* 15 | VIEW | | 57 | 912 | 2 (0)| 00:00:01 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_284C5768 | 57 | 4104 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("HORIZON_TABLE"."ID"="A_TABLE"."HORIZON_ID" AND
"ANCHOR_TABLE"."ID"="A_TABLE"."ANCHOR_ID")
5 - filter("A_TABLE"."A_VALUE">=499500 AND "A_TABLE"."A_VALUE"<=500500)
10 - filter("D"."POSITION">= (SELECT "D1"."POSITION"-:B1 FROM (SELECT + CACHE_TEMP_TABLE
("T1") "C0" "ID","C1" "DESCR","C2" "A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM
"SYS"."SYS_TEMP_0FD9D6628_284C5768" "T1") "D1" WHERE "D1"."ANCHOR"=1) AND "D"."POSITION"<=
(SELECT "D2"."POSITION"+:B2 FROM (SELECT + CACHE_TEMP_TABLE ("T1") "C0" "ID","C1"
"DESCR","C2" "A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM
"SYS"."SYS_TEMP_0FD9D6628_284C5768" "T1") "D2" WHERE "D2"."ANCHOR"=1))
13 - filter("D1"."ANCHOR"=1)
15 - filter("D2"."ANCHOR"=1)
Note
-----
- dynamic sampling used for this statement (level=4)
*/
-- Explain plan of database view
explain plan for
select *
from testdata_vw
where a_value between 500000 - 500 and 500000 + 500;
select plan_table_output
from table(dbms_xplan.display('plan_table', null, null));
/*
Note: Size of SYS_TEMP_0FD9D662A_284C5768 ~ 1000000 rows
Plan hash value: 1422141561
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2973 | 180K| | 50324 (1)| 00:14:16 |
| 1 | VIEW | TESTDATA_VW | 2973 | 180K| | 50324 (1)| 00:14:16 |
| 2 | TEMP TABLE TRANSFORMATION | | | | | | |
| 3 | LOAD AS SELECT | SYS_TEMP_0FD9D662A_284C5768 | | | | | |
| 4 | WINDOW SORT | | 1189K| 136M| 147M| 37032 (1)| 00:10:30 |
|* 5 | HASH JOIN | | 1189K| 136M| | 6868 (1)| 00:01:57 |
| 6 | TABLE ACCESS FULL | HORIZON_TABLE | 1 | 26 | | 3 (0)| 00:00:01 |
|* 7 | HASH JOIN | | 1189K| 106M| 38M| 6860 (1)| 00:01:57 |
| 8 | TABLE ACCESS FULL | ANCHOR_TABLE | 1189K| 24M| | 583 (2)| 00:00:10 |
| 9 | TABLE ACCESS FULL | A_TABLE | 1209K| 83M| | 1191 (2)| 00:00:21 |
|* 10 | FILTER | | | | | | |
|* 11 | VIEW | | 1189K| 70M| | 4431 (1)| 00:01:16 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D662A_284C5768 | 1189K| 81M| | 4431 (1)| 00:01:16 |
|* 13 | VIEW | | 1189K| 18M| | 4431 (1)| 00:01:16 |
| 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D662A_284C5768 | 1189K| 81M| | 4431 (1)| 00:01:16 |
|* 15 | VIEW | | 1189K| 18M| | 4431 (1)| 00:01:16 |
| 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9D662A_284C5768 | 1189K| 81M| | 4431 (1)| 00:01:16 |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("HORIZON_TABLE"."ID"="A_TABLE"."HORIZON_ID")
7 - access("ANCHOR_TABLE"."ID"="A_TABLE"."ANCHOR_ID")
10 - filter("D"."POSITION">= (SELECT "D1"."POSITION"-:B1 FROM (SELECT + CACHE_TEMP_TABLE ("T1")
"C0" "ID","C1" "DESCR","C2" "A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM
"SYS"."SYS_TEMP_0FD9D662A_284C5768" "T1") "D1" WHERE "D1"."ANCHOR"=1) AND "D"."POSITION"<= (SELECT
"D2"."POSITION"+:B2 FROM (SELECT + CACHE_TEMP_TABLE ("T1") "C0" "ID","C1" "DESCR","C2"
"A_VALUE","C3" "OFFSET","C4" "ANCHOR","C5" "POSITION" FROM "SYS"."SYS_TEMP_0FD9D662A_284C5768" "T1") "D2"
WHERE "D2"."ANCHOR"=1))
11 - filter("A_VALUE">=499500 AND "A_VALUE"<=500500)
13 - filter("D1"."ANCHOR"=1)
15 - filter("D2"."ANCHOR"=1)
Note
-----
- dynamic sampling used for this statement (level=4)
*/
sqlfiddle
explain plan of sql http://www.sqlfiddle.com/#!4/6a7022/3
explain plan of view http://www.sqlfiddle.com/#!4/6a7022/2
You need to write a view definition which returns all possible selectable ranges of a_value as two columns, start_a_value and end_a_value, along with all records which fall into each start/end range. In other words, the correct view definition logically describes a result set on the order of n^3 rows given n rows in a_table.
Then query that view as:
SELECT * FROM testdata_vw WHERE START_A_VALUE = 4950 AND END_A_VALUE = 5050;
Also, your multiple references to "data" are unnecessary; same logic can be delivered with an additional analytic function.
Final view def:
CREATE OR REPLACE VIEW testdata_vw AS
SELECT *
FROM
(
SELECT T.*,
MAX(CASE WHEN ANCHOR=1 THEN POSITION END)
OVER (PARTITION BY START_A_VALUE, END_A_VALUE) ANCHOR_POS
FROM
(
SELECT S.A_VALUE START_A_VALUE,
E.A_VALUE END_A_VALUE,
B.ID ID,
B.DESCR DESCR,
HORIZON_TABLE.OFFSET OFFSET,
CASE
WHEN ANCHOR_TABLE.A_DATE = TRUNC(SYSDATE)
THEN 1
ELSE 0
END ANCHOR,
ROW_NUMBER()
OVER(PARTITION BY S.A_VALUE, E.A_VALUE
ORDER BY B.A_POSITION_FIELD) POSITION
FROM
A_TABLE S
JOIN A_TABLE E
ON S.A_VALUE<E.A_VALUE
JOIN A_TABLE B
ON B.A_VALUE BETWEEN S.A_VALUE AND E.A_VALUE
JOIN ANCHOR_TABLE
ON ANCHOR_TABLE.ID = B.ANCHOR_ID
JOIN HORIZON_TABLE
ON HORIZON_TABLE.ID = B.HORIZON_ID
) T
) T
WHERE POSITION BETWEEN ANCHOR_POS - OFFSET AND ANCHOR_POS+OFFSET;
EDIT: SQL Fiddle with expected execution plan
I'm seeing the same (sensible) plan here that I saw in my database; if you're getting something different, please send a fiddle link.
Use index lookup to find 1 row in "S" A_TABLE (A_VALUE = 4950)
Use index lookup to find 1 row in "E" A_TABLE (A_VALUE = 5050)
Nested Loop join #1 and #2 (1 x 1 join, still 1 row)
FTS 1 row from HORIZON table
Cartesian join #1 and #2 (1 x 1, okay to use Cartesian).
Use index lookup to find ~100 rows in "B" A_TABLE with values between 4950 and 5050.
Cartesian join #5 and #6 (1 x 102, okay to use Cartesian).
FTS ANCHOR_TABLE with hash join to #7.
Window-sort for analytic functions
You have a predicate outside the view and you want it to be applied inside the view.
For this, you can use the push_pred hint:
select /*+PUSH_PRED(v)*/
*
from
testdata_vw v
where
a_value between 5000 - 50 and 5000 + 50;
SQLFIDDLE
EDIT: Now I see that you use the data subquery three times. For the first occurrence it makes sense to push the predicate, but for d1 and d2 it doesn't; those references are effectively a different query.
What I would do is use two context variables, set them according to my needs, and write the query:
SYS_CONTEXT('my_context_name', 'var5000');
create or replace view testdata_vw as
with data as (
select
a_table.id,
a_table.descr,
horizon_table.offset,
case
when anchor_table.a_date = trunc(sysdate) then
1
else
0
end as anchor,
row_number() over(
order by a_table.a_position_field) as position
from a_table
join anchor_table on (anchor_table.id = a_table.anchor_id)
join horizon_table on (horizon_table.id = a_table.horizon_id)
where a_table.a_value between SYS_CONTEXT('my_context_name', 'var5000') - SYS_CONTEXT('my_context_name', 'var50') and SYS_CONTEXT('my_context_name', 'var5000') + SYS_CONTEXT('my_context_name', 'var50')
)
select *
from data d
where d.position between (
select d1.position - d.offset
from data d1
where d1.anchor = 1)
and (
select d2.position + d.offset
from data d2
where d2.anchor = 1) ;
to use it:
dbms_session.set_context ('my_context_name', 'var5000', 5000);
dbms_session.set_context ('my_context_name', 'var50', 50);
select * from testdata_vw;
UPDATE: Instead of context variables (which can be used across sessions) you can use package variables, as you commented.
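A minimal sketch of the package-variable variant (all names are made up; a view can only call package functions, not read package variables directly, hence the getters):
create or replace package filter_params as
  g_center number := 500000;
  g_span   number := 50;
  function center return number;
  function span return number;
end filter_params;
/
create or replace package body filter_params as
  function center return number is begin return g_center; end;
  function span   return number is begin return g_span;   end;
end filter_params;
/
-- in the view, replace the SYS_CONTEXT expressions with
--   a_table.a_value between filter_params.center - filter_params.span
--                       and filter_params.center + filter_params.span
-- and set g_center / g_span per session before querying the view.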
I have two queries which our application executes; the only difference between them is the count(*) column (one query has it, the other doesn't).
All the queries are dynamically generated, and we provide our software to clients who run it against their own databases (we do not have access to those databases). One of the queries runs very slowly (I couldn't get it to finish after waiting for hours). SQL Tuning Advisor suggests accepting a SQL profile, which helps, but that means I would have to tell our client to run it and accept the plan. It would be much better if there were an index we could create to speed up the query.
Here's what the query looks like:
select
a.company_id
, count(*)
from
b
INNER JOIN a
ON
b.company_id = a.company_id AND
b.sequence_num = a.sequence_num
INNER JOIN c
ON
b.company_id = c.company_id AND
b.sequence_num = c.sequence_num
INNER JOIN d
ON
c.cash_receipt_num = d.cash_receipt_num
INNER JOIN e
ON
e.code_list_id = 'CONSTANT'
where
(a.company_id='123')
GROUP BY
a.company_id
order by
a.company_id ASC
When the query has count(*), it runs in about a second. The query without count(*) runs for several hours before I kill it, so I never got it to finish.
Here is a count of records in each table:
select count(*) from a -- 1,007,948
select count(*) from b -- 148,378
select count(*) from c -- 138,901
select count(*) from d -- 136,424
select count(*) from e -- 1
The returned result should be '123', with a count of 908,683 if the query includes the count(*) column.
Here's what the execution plan looks like:
--With count (fast):
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 6 (0)| 00:00:01 |
| 1 | SORT GROUP BY NOSORT | | 1 | 49 | 6 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 1 | 49 | 6 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 39 | 4 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 33 | 4 (0)| 00:00:01 |
| 5 | NESTED LOOPS | | 1 | 23 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | e_KEY00 | 1 | 7 | 1 (0)| 00:00:01 |
|* 7 | TABLE ACCESS FULL| c | 2 | 32 | 2 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | b_KEY00 | 1 | 10 | 1 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | d_KEY00 | 1 | 6 | 0 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | a_KEY00 | 1 | 10 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$3FA9081A
6 - SEL$3FA9081A / e#SEL$4
7 - SEL$3FA9081A / c#SEL$2
8 - SEL$3FA9081A / b#SEL$1
9 - SEL$3FA9081A / d#SEL$3
10 - SEL$3FA9081A / a#SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("e"."CODE_LIST_ID"='CONSTANT')
7 - filter("c"."COMPANY_ID"='123')
8 - access("b"."COMPANY_ID"='123' AND
"b"."SEQUENCE_NUM"="c"."SEQUENCE_NUM")
9 - access("c"."CASH_RECEIPT_NUM"="d"."CASH_RECEIPT_NUM")
10 - access("a"."COMPANY_ID"='123' AND
"b"."SEQUENCE_NUM"="a"."SEQUENCE_NUM")
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) '123'[3], COUNT(*)[22]
2 - (#keys=0)
3 - (#keys=0) "b"."SEQUENCE_NUM"[NUMBER,22]
4 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22],
"b"."SEQUENCE_NUM"[NUMBER,22]
5 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22],
"c"."SEQUENCE_NUM"[NUMBER,22]
7 - "c"."CASH_RECEIPT_NUM"[NUMBER,22],
"c"."SEQUENCE_NUM"[NUMBER,22]
8 - "b"."SEQUENCE_NUM"[NUMBER,22]
-- without count (slow)
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 6 (0)| 00:00:01 |
| 1 | SORT GROUP BY NOSORT | | 1 | 49 | 6 (0)| 00:00:01 |
| 2 | NESTED LOOPS SEMI | | 1 | 49 | 6 (0)| 00:00:01 |
| 3 | NESTED LOOPS SEMI | | 1 | 43 | 6 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 33 | 5 (0)| 00:00:01 |
| 5 | NESTED LOOPS | | 1 | 23 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | e_KEY00 | 1 | 7 | 1 (0)| 00:00:01 |
|* 7 | TABLE ACCESS FULL | c | 2 | 32 | 2 (0)| 00:00:01 |
|* 8 | INDEX FAST FULL SCAN| a_KEY00 | 2 | 20 | 1 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | b_KEY00 | 139K| 1366K| 1 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | d_KEY00 | 136K| 799K| 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$3FA9081A
6 - SEL$3FA9081A / e#SEL$4
7 - SEL$3FA9081A / c#SEL$2
8 - SEL$3FA9081A / a#SEL$1
9 - SEL$3FA9081A / b#SEL$1
10 - SEL$3FA9081A / d#SEL$3
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("e"."CODE_LIST_ID"='CONSTANT')
7 - filter("d"."COMPANY_ID"='123')
8 - filter("a"."COMPANY_ID"='123')
9 - access("b"."COMPANY_ID"='123' AND
"b"."SEQUENCE_NUM"="a"."SEQUENCE_NUM")
filter("b"."SEQUENCE_NUM"="c"."SEQUENCE_NUM")
10 - access("c"."CASH_RECEIPT_NUM"="d"."CASH_RECEIPT_NUM")
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) '123'[3]
2 - (#keys=0)
3 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22]
4 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22],
"c"."SEQUENCE_NUM"[NUMBER,22],
"a"."SEQUENCE_NUM"[NUMBER,22]
5 - (#keys=0) "c"."CASH_RECEIPT_NUM"[NUMBER,22],
"c"."SEQUENCE_NUM"[NUMBER,22]
7 - "c"."CASH_RECEIPT_NUM"[NUMBER,22],
"c"."SEQUENCE_NUM"[NUMBER,22]
8 - "a"."SEQUENCE_NUM"[NUMBER,22]
I suspect the issue is with statistics. I tried running the following:
begin
DBMS_STATS.GATHER_SCHEMA_STATS (
ownname => 'owner_of_tables_here',
estimate_percent => 100
);
end;
EXEC dbms_stats.gather_database_stats;
EXEC dbms_stats.gather_database_stats(estimate_percent => 100, block_sample => FALSE, method_opt => 'FOR ALL COLUMNS', granularity => 'ALL', cascade => TRUE, options => 'GATHER');
-- for each index mentioned in explain plan:
EXEC DBMS_STATS.GATHER_INDEX_STATS(ownname => 'owner_of_tables_here', indname => 'index name here', estimate_percent => 100)
-- for each of the five tables:
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'owner_of_tables_here', tabname => 'table name here', estimate_percent => 100, block_sample => FALSE, method_opt => 'FOR ALL COLUMNS', granularity => 'ALL', cascade => TRUE)
Am I missing something? Would the client have to run SQL Tuning Advisor and accept the suggested SQL profile?
Oracle version: 12.1.0.2.0
Explanation for why the query is the way it is: the application from which this query is taken allows the user to select columns from the UI. For example, if a client only wants to see all companies they have access to, then the query above runs; if they want to see all companies and how many records there are per company, then the count(*) query executes. The reason there's a "where company_id = '123'" is that this particular user only has permission to see one company; a different user may have permission to see all or multiple companies, in which case the dynamically generated filter would be different. (I understand the query looks very odd the way it is, but usually the query would have a lot of columns and no GROUP BY clause, which actually runs fast.)
With some speculation as I don't have your exact data:
Your column COMPANY_ID in table A is probably skewed: the table has 1M rows, of which more than 900K have company_id = '123'.
First, check the execution plan of a simplified query:
select * from a where company_id = '123'
If it shows an unrealistically low value such as 1 or 2, check whether the column COMPANY_ID has a histogram:
select HISTOGRAM from user_tab_columns where table_name = 'A' and COLUMN_NAME = 'COMPANY_ID';
I expect not.
Gather a histogram for this column, e.g. with:
exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'a',granularity=>'all',method_opt=>'FOR COLUMNS COMPANY_ID',estimate_percent => 100,cascade=>TRUE);
Check the execution plan of the simplified query again:
select * from a where company_id = '123'
This should show some 900K rows. I hope this correct cardinality will prevent table A from being used in the wrong position (as in the slow plan).
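To confirm the histogram is in place afterwards, a quick dictionary check (a sketch):
select column_name, histogram, num_distinct, num_buckets
from user_tab_col_statistics
where table_name = 'A'
and column_name = 'COMPANY_ID';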