How do I populate the same data into multiple rows when the employee ID is the same, without querying the table every time?
E.g., if I get the rows below from the EMPLOYEE table:
EMPLID CHANGETIME
------ --------------
1234 8/10/2017
1234 8/11/2017
For the above employee I need to query the NAME table to get the names and populate both rows.
EMPLID CHANGETIME FirstNAME LastNAME
------ ---------- --------- --------
1234 08/10/17 JOHN MATHEW
1234 08/11/17 JOHN MATHEW
When I query the first time, I would like to store the result in an array or some variable, and reuse it when the EMPLID matches the previous one.
I just want to do this to improve performance. Any hint would be helpful.
Right now I'm doing a bulk insert into a table type, and the NAME table is searched every time a row is fetched from the EMPLOYEE table.
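The "store it in an array" idea can be done in PL/SQL with an associative array keyed by EMPLID, so the NAME table is only queried on a cache miss. A minimal sketch, assuming the NAME table has FIRSTNAME and LASTNAME columns (the column names are guesses from the desired output, not confirmed by the question):

```sql
DECLARE
    TYPE name_rec IS RECORD (
        firstname NAME.FIRSTNAME%TYPE,
        lastname  NAME.LASTNAME%TYPE
    );
    -- Cache keyed by employee id: populated on the first lookup only
    TYPE name_cache_t IS TABLE OF name_rec INDEX BY PLS_INTEGER;
    name_cache name_cache_t;

    FUNCTION get_name (p_emplid PLS_INTEGER) RETURN name_rec IS
    BEGIN
        IF NOT name_cache.EXISTS(p_emplid) THEN
            -- Cache miss: hit the NAME table once for this EMPLID
            SELECT n.firstname, n.lastname
              INTO name_cache(p_emplid)
              FROM name n
             WHERE n.emplid = p_emplid;
        END IF;
        RETURN name_cache(p_emplid);
    END;
BEGIN
    FOR r IN (SELECT emplid, changetime FROM employee ORDER BY emplid) LOOP
        DBMS_OUTPUT.PUT_LINE(r.emplid || ' ' || TO_CHAR(r.changetime) || ' '
                             || get_name(r.emplid).firstname || ' '
                             || get_name(r.emplid).lastname);
    END LOOP;
END;
/
```

With rows ordered by EMPLID, consecutive rows for the same employee resolve the name from the in-memory array instead of re-querying the table.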
I would use a join to get the employee name (also within PL/SQL), like:
SELECT e.emplid, e.first_name, e.last_name, c.changetime
FROM employee_changes c
INNER JOIN employee e ON e.emplid = c.emplid
WHERE c.change_time > sysdate - 30
ORDER BY e.emplid, c.change_time
The SELECT can be used as a cursor if you want to.
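For instance, the join can be consumed through a cursor FOR loop; a sketch using the same query:

```sql
BEGIN
    FOR rec IN (SELECT e.emplid, e.first_name, e.last_name, c.changetime
                  FROM employee_changes c
                  JOIN employee e ON e.emplid = c.emplid
                 WHERE c.change_time > SYSDATE - 30
                 ORDER BY e.emplid, c.change_time)
    LOOP
        -- Each row already carries the name: no per-row lookup needed
        DBMS_OUTPUT.PUT_LINE(rec.emplid || ' ' || rec.first_name
                             || ' ' || rec.last_name);
    END LOOP;
END;
/
```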
I think you need some extra criterion, like "the last change time".
In that case you can code something like this:
SELECT e.EMPLID, e.CHANGETIME, n.FirstNAME, n.LastNAME
FROM Employee e, NAME n
WHERE e.emplid = n.emplid
AND e.changetime = (SELECT MAX(e1.changetime)
FROM Employee e1
WHERE e1.emplid = e.emplid);
"For each Employee, just get de max changetime"
If you are using 11g, you should consider the Result Cache feature. This allows you to define functions whose returned values are stored in memory. Something like this:
create or replace function get_name
(p_empid pls_integer)
return varchar2
result_cache relies_on (names)
is
return_value varchar2(30);
begin
select empname into return_value
from names
where empid = p_empid;
return return_value;
end get_name;
/
Note the RELIES_ON clause: it is optional, but it makes the point that caching is only useful for slowly changing tables. If there's a lot of churn in the NAMES table, Oracle will keep flushing the cache and you won't get much benefit. But at least the results will be correct, something you can't guarantee with your current approach.
Here is a sample. Ignore the elapsed times, but note how little effort is required to get the same employee name on the second call:
SQL> set autotrace on
SQL> set timing on
SQL> select get_name(7934) from dual
2 /
GET_NAME(7934)
------------------------------------------------------------------------------------------------------------------------------------------------------
MILLER
Elapsed: 00:00:01.25
Execution Plan
----------------------------------------------------------
Plan hash value: 1388734953
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------
Statistics
----------------------------------------------------------
842 recursive calls
166 db block gets
1074 consistent gets
44 physical reads
30616 redo size
499 bytes sent via SQL*Net to client
437 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
134 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
Second call:
SQL> r
1* select get_name(7934) from dual
GET_NAME(7934)
------------------------------------------------------------------------------------------------------------------------------------------------------
MILLER
Elapsed: 00:00:00.13
Execution Plan
----------------------------------------------------------
Plan hash value: 1388734953
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
499 bytes sent via SQL*Net to client
437 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
The documentation has lots more on this nifty feature.
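If you want to verify that the cache is actually being hit, Oracle exposes result cache statistics in a V$ view and a memory report package; for example:

```sql
-- Counters such as "Find Count" (cache hits) and "Create Count Success"
SELECT name, value FROM v$result_cache_statistics;

-- Detailed memory report, printed via DBMS_OUTPUT
SET SERVEROUTPUT ON
EXEC DBMS_RESULT_CACHE.MEMORY_REPORT
```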
Related
How do you tell which Oracle plan is better when comparing different queries that produce the same number of rows?
If last_consistent_gets should be low, I see that the query with the lower value has a higher elapsed time, while the other query has a lower elapsed time but a higher last_consistent_gets.
It's very confusing.
The elapsed time is usually the most important metric for Oracle performance. In theory, we may occasionally want to sacrifice the run time of one SQL statement to preserve resources for other statements. In practice, those situations are rare.
And in cases like yours, a statement that consumes more consistent gets can be both faster and more efficient. For example, when retrieving a large percentage of data from a table, a full table scan is often more efficient than an index scan. A full table scan can use multi-block reads, which can be much more efficient than the many single-block reads of an index scan. Storage systems are generally much faster at reading one large chunk of data than many small chunks.
The below example compares reading 25% of the data from a table. The index approach uses only half as many consistent gets, but it is also more than twice as slow.
Sample Schema
Create a simple table and index and gather stats.
create table test1(a number, b number);
insert into test1 select level, level from dual connect by level <= 1000000;
create index test1_ids on test1(a);
begin
dbms_stats.gather_table_stats(user, 'TEST1');
end;
/
Autotrace
The code below shows that the full table scan consumes 2082 consistent gets, while forcing index access consumes 1078 consistent gets.
JHELLER#orclpdb> set autotrace on;
JHELLER#orclpdb> set linesize 120;
JHELLER#orclpdb> select sum(b) from test1 where a >= 750000;
SUM(B)
----------
2.1875E+11
Execution Plan
----------------------------------------------------------
Plan hash value: 3896847026
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 10 | 597 (3)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 10 | | |
|* 2 | TABLE ACCESS FULL| TEST1 | 250K| 2441K| 597 (3)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">=750000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2082 consistent gets
0 physical reads
0 redo size
552 bytes sent via SQL*Net to client
404 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
JHELLER#orclpdb> select /*+ index(test1) */ sum(b) from test1 where a >= 750000;
SUM(B)
----------
2.1875E+11
Execution Plan
----------------------------------------------------------
Plan hash value: 1247966541
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 10 | 1084 (1)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 10 | | |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST1 | 250K| 2441K| 1084 (1)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | TEST1_IDS | 250K| | 563 (1)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A">=750000)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1078 consistent gets
0 physical reads
0 redo size
552 bytes sent via SQL*Net to client
424 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Performance
If you run the statements a hundred times in a loop (and run those loops multiple times to ignore caching and other system activity), the full table scan version runs much faster than the forced index scan version.
--Seconds to run plan with more consistent gets: 1.7, 1.7, 1.8
declare
v_count number;
begin
for i in 1 .. 100 loop
select sum(b) into v_count from test1 where a >= 750000;
end loop;
end;
/
--Seconds to run plan with fewer consistent gets: 4.5, 4.5, 4.5
declare
v_count number;
begin
for i in 1 .. 100 loop
select /*+ index(test1) */ sum(b) into v_count from test1 where a >= 750000;
end loop;
end;
/
Exceptions
There are some times when resource consumption is more important than elapsed time. For example, parallelism is kind of cheating in that it forces the system to work harder, not smarter. A single out-of-control parallel query can take down an entire system. There are also times when you need to break up statements into less efficient versions to decrease the amount of time something is locked, or to avoid consuming too much UNDO or temporary tablespace.
But the above examples are somewhat uncommon exceptions, and they generally only happen when dealing with data warehouses that query a large amount of data. For most OLTP systems, where every query takes less than a second, the elapsed time is the only metric you need to worry about.
I have got an audit table as below.
create table "AUDIT_LOG"
(
"AUDIT_ID" NVARCHAR2(70),
"PAYMENT_IDENTIFICATION_ID" NVARCHAR2(70),
"ACCOUNT_NUMBER" NVARCHAR2(100)
PRIMARY KEY ("AUDIT_ID")
);
I have the indexes below:
payment_idx on ("PAYMENT_IDENTIFICATION_ID")
payment_id_idx on ("PAYMENT_IDENTIFICATION_ID", "AUDIT_ID")
system_index on primary key AUDIT_ID
Below are the queries I am using.
Query1 :
Select * FROM
AUDIT_LOG
WHERE
PAYMENT_IDENTIFICATION_ID =
'ID124'
AND
AUDIT_ID<>'ecfdc2c3-87eb-48c9-b53c';
Query2 :
Select * FROM
AUDIT_LOG
WHERE
PAYMENT_IDENTIFICATION_ID =
'ID124'
AND
AUDIT_ID='ecfdc2c3-87eb-48c9-b53c';
The first query's explain plan shows the use of index payment_id_idx with the option BY INDEX ROWID BATCHED.
However, the second query's explain plan shows the use of the system index on the primary key AUDIT_ID with the option BY INDEX ROWID BATCHED.
I was of the opinion that the index payment_id_idx should be used in both queries.
Any idea why the second query is not using the composite index payment_id_idx?
Any help is much appreciated.
Let's try to simulate a scenario similar to yours.
SQL> alter session set current_schema=test ;
Session altered.
SQL> create table "AUDIT_LOG"
(
"AUDIT_ID" NVARCHAR2(70),
"PAYMENT_IDENTIFICATION_ID" NVARCHAR2(70),
"ACCOUNT_NUMBER" NVARCHAR2(100)
); 2 3 4 5 6
Table created.
SQL> alter table audit_log add primary key ( audit_id ) ;
Table altered.
SQL> create index payment_idx on audit_log ("PAYMENT_IDENTIFICATION_ID");
Index created.
SQL> create index payment_id_idx on audit_log ("PAYMENT_IDENTIFICATION_ID", "AUDIT_ID");
Index created.
Now let's insert some demo data, built with the following considerations:
AUDIT_ID is unique, in the form IDxxx (where xxx takes values from 1 to 1M)
PAYMENT_IDENTIFICATION_ID takes one of 10 distinct values, each a single letter repeated via LPAD to 50 characters; the idea is simply to have 10 distinct values
ACCOUNT_NUMBER is one random letter padded with LPAD to fill 70 characters
Thus:
begin
for i in 1 .. 1000000
loop
insert into audit_log values
( 'ID'||i||'' ,
case when i between 1 and 100000 then lpad('A',50,'A')
when i between 100001 and 200000 then lpad('B',50,'B')
when i between 200001 and 300000 then lpad('C',50,'C')
when i between 300001 and 400000 then lpad('D',50,'D')
when i between 400001 and 500000 then lpad('E',50,'E')
when i between 500001 and 600000 then lpad('F',50,'F')
when i between 600001 and 700000 then lpad('G',50,'G')
when i between 700001 and 800000 then lpad('H',50,'H')
when i between 800001 and 900000 then lpad('I',50,'I')
when i between 900001 and 1000000 then lpad('J',50,'J')
end ,
lpad(dbms_random.string('U',1),70,'B')
);
end loop;
commit;
end;
/
First Query
SQL> set autotrace traceonly lines 220 pages 400
SQL> Select * FROM
AUDIT_LOG
WHERE
PAYMENT_IDENTIFICATION_ID = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
AND
AUDIT_ID <> 'ID123482'; 2 3 4 5 6
100000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 272803615
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 20M| 3767 (1)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| AUDIT_LOG | 100K| 20M| 3767 (1)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | PAYMENT_IDX | 100K| | 1255 (1)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("AUDIT_ID"<>U'ID123482')
2 - access("PAYMENT_IDENTIFICATION_ID"=U'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAA')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
16982 consistent gets
2630 physical reads
134596 redo size
12971296 bytes sent via SQL*Net to client
73843 bytes received via SQL*Net from client
6668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
Second query
SQL> set autotrace traceonly lines 220 pages 400
SQL> Select * FROM
AUDIT_LOG
WHERE
PAYMENT_IDENTIFICATION_ID = 'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF'
AND
AUDIT_ID ='ID578520'; 2 3 4 5 6
Execution Plan
----------------------------------------------------------
Plan hash value: 303326437
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 219 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| AUDIT_LOG | 1 | 219 | 3 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | SYS_C0076603 | 1 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("PAYMENT_IDENTIFICATION_ID"=U'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFF')
2 - access("AUDIT_ID"=U'ID578520')
Statistics
----------------------------------------------------------
9 recursive calls
6 db block gets
9 consistent gets
7 physical reads
1080 redo size
945 bytes sent via SQL*Net to client
515 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
The predicate information gives you a lot of information regarding the access paths:
In the first query:
1 - filter("AUDIT_ID"<>U'ID123482')
2 - access("PAYMENT_IDENTIFICATION_ID"=U'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAA')
The access is determined by the "=" operator, and in this case a range scan of the index PAYMENT_IDX is the best approach. The filter is then applied to every row matching the access condition, discarding those whose AUDIT_ID is <> the given value.
In the second query:
1 - filter("PAYMENT_IDENTIFICATION_ID"=U'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFF')
2 - access("AUDIT_ID"=U'ID578520')
The access is via the primary key index: since you are using = as the operator, there is no better way to find the row than the PK index, hence the INDEX UNIQUE SCAN. The filter is applied during the table access, as Oracle has already located the row through the unique primary key index; that lookup can return at most one row, so the extra filter condition does very little work.
Because the first query uses <> against the primary key column, Oracle uses the other index instead, assuming (as in the example) that there are very few distinct values. Keep in mind that if it used the PK index, it would retrieve 999,999 rows in the first step and then apply the filter, which is far less efficient than using the second index.
If you force the CBO to use the PK index, you can see this:
SQL> Select /*+INDEX(a,SYS_C0076603) */ * FROM
AUDIT_LOG a
WHERE
PAYMENT_IDENTIFICATION_ID = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
AND
AUDIT_ID <> 'ID123482'; 2 3 4 5 6
100000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3265638686
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 20M| 207K (1)| 00:00:17 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| AUDIT_LOG | 100K| 20M| 207K (1)| 00:00:17 |
|* 2 | INDEX FULL SCAN | SYS_C0076603 | 999K| | 3212 (1)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("PAYMENT_IDENTIFICATION_ID"=U'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AA')
2 - filter("AUDIT_ID"<>U'ID123482')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
218238 consistent gets
18520 physical reads
1215368 redo size
12964630 bytes sent via SQL*Net to client
73873 bytes received via SQL*Net from client
6668 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100000 rows processed
I am inserting 10000 rows into an Oracle OLTP table every 30 seconds, which is about 240 MB of data every half hour. All 10000 rows have the same timestamp, which I floor to a 30-second boundary. I also have 3 indexes, one of which is a spatial point-geometry index (latitude and longitude). The timestamp is also indexed.
During a test, the 2 CPUs showed 50% utilization and I/O showed 80%, with inserts doubling in duration after half an hour.
I also select from the table to get the 10000 rows with the last inserted timestamp, using a sub-query to find the maximum timestamp, because inserts and selects are two different processes (Python for inserts and Google Maps for selects). I tried a strategy of using the current time to retrieve the last 10000 rows, but I could not get it to work, even when going for the batch before the last one; it often returned no rows.
My question is: how can I retrieve the last inserted 10000 rows efficiently, and what type of index and/or table would be most appropriate when all 10000 rows share the same timestamp value? Keeping the insert time low, and stopping it from doubling in duration, is of more importance. I'm not sure whether a history table is needed in addition, keeping only the latest rows in the current table; but surely that would double the amount of I/O, which seems to be the biggest issue currently. Any advice will be appreciated.
The database can "walk" down the "right hand side" of an index to very quickly get the maximum value. Here's an example
SQL> create table t ( ts date not null, x int, y int, z int );
Table created.
SQL>
SQL> begin
2 for i in 1 .. 100
3 loop
4 insert into t
5 select sysdate, rownum, rownum, rownum
6 from dual
7 connect by level <= 10000;
8 commit;
9 end loop;
10 end;
11 /
PL/SQL procedure successfully completed.
SQL>
SQL> create index ix on t (ts );
Index created.
SQL>
SQL> set autotrace on
SQL> select max(ts) from t;
MAX(TS)
---------
12-JUN-20
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1223533863
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IX | 1 | 9 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Statistics
----------------------------------------------------------
6 recursive calls
0 db block gets
92 consistent gets
8 physical reads
0 redo size
554 bytes sent via SQL*Net to client
383 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
So 92 consistent gets is pretty snappy. However, you can probably do better by jumping straight to the very last leaf block with a descending index read, e.g.:
SQL> select *
2 from (
3 select ts from t order by ts desc
4 )
5 where rownum = 1;
TS
---------
12-JUN-20
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3852867534
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 3 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1184K| 10M| 3 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN DESCENDING| IX | 1184K| 10M| 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM=1)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Statistics
----------------------------------------------------------
9 recursive calls
5 db block gets
9 consistent gets
0 physical reads
1024 redo size
549 bytes sent via SQL*Net to client
430 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
So your current index is fine. Simply get the highest timestamp as per the above and you're good to go.
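Then, to pull back the whole batch of rows that shares that latest timestamp, the MIN/MAX subquery can drive the main query; e.g. against the demo table above:

```sql
-- The subquery can use the INDEX FULL SCAN (MIN/MAX) shown above;
-- the outer query then range-scans the same index for that one value
SELECT *
  FROM t
 WHERE ts = (SELECT MAX(ts) FROM t);
```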
This query takes 1 second when executed alone, but 20 seconds when executed through a procedure. Please help me with this.
SELECT * FROM
(SELECT TAB1.*,ROWNUM ROWNUMM FROM
(SELECT wh.workitem_id, wh.workitem_priority, wh.workitem_type_id, wt.workitem_type_nm,
wh.workitem_status_id, ws.workitem_status_nm, wh.analyst_group_id,
ag.analyst_group_nm, wh.owner_uuid, earnings_estimate.pr_get_name_from_uuid(owner_uuid) owner_name,
wh.create_user_id, earnings_estimate.pr_get_name_from_uuid( wh.create_user_id) create_name, wh.create_ts,
wh.update_user_id,earnings_estimate.pr_get_name_from_uuid(wh.update_user_id) update_name, wh.update_ts, wh.bb_ticker_id, wh.node_id,
wh.eqcv_analyst_uuid, earnings_estimate.pr_get_name_from_uuid( wh.eqcv_analyst_uuid) eqcv_analyst_name,
WH.WORKITEM_NOTE,Wh.PACKAGE_ID ,Wh.COVERAGE_STATUS_NUM ,CS.COVERAGE_STATUS_CD ,Wh.COVERAGE_REC_NUM,I.INDUSTRY_CD INDUSTRY_CODE,I.INDUSTRY_NM
INDUSTRY_NAME,WOT.WORKITEM_OUTLIER_TYPE_NM as WORKITEM_SUBTYPE_NM
,count(1) over() AS total_count,bro.BB_ID BROKER_BB_ID,bro.BROKER_NM BROKER_NAME, wh.assigned_analyst_uuid,earnings_estimate.pr_get_name_from_uuid(wh.assigned_analyst_uuid)
assigned_analyst_name
FROM earnings_estimate.workitem_type wt,
earnings_estimate.workitem_status ws,
earnings_estimate.workitem_outlier_type wot,
(SELECT * FROM (
SELECT WH.ASSIGNED_ANALYST_UUID,WH.DEFERRED_TO_DT,WH.WORKITEM_NOTE,WH.UPDATE_USER_ID,EARNINGS_ESTIMATE.PR_GET_NAME_FROM_UUID(WH.UPDATE_USER_ID)
UPDATE_NAME, WH.UPDATE_TS,WH.OWNER_UUID, EARNINGS_ESTIMATE.PR_GET_NAME_FROM_UUID(OWNER_UUID)
OWNER_NAME,WH.ANALYST_GROUP_ID,WH.WORKITEM_STATUS_ID,WH.WORKITEM_PRIORITY,EARNINGS_ESTIMATE.PR_GET_NAME_FROM_UUID( WI.CREATE_USER_ID) CREATE_NAME, WI.CREATE_TS,
wi.create_user_id,wi.workitem_type_id,wi.workitem_id,RANK() OVER (PARTITION BY WH.WORKITEM_ID ORDER BY WH.CREATE_TS DESC NULLS LAST, ROWNUM) R,
wo.bb_ticker_id, wo.node_id,wo.eqcv_analyst_uuid,
WO.PACKAGE_ID ,WO.COVERAGE_STATUS_NUM ,WO.COVERAGE_REC_NUM,
wo.workitem_outlier_type_id
FROM earnings_estimate.workitem_history wh
JOIN EARNINGS_ESTIMATE.workitem_outlier wo
ON wh.workitem_id=wo.workitem_id
JOIN earnings_estimate.workitem wi
ON wi.workitem_id=wo.workitem_id
AND WI.WORKITEM_TYPE_ID=3
and wh.workitem_status_id not in (1,7)
WHERE ( wo.bb_ticker_id IN (SELECT
column_value from table(v_tickerlist) )
)
)wh
where r=1
AND DECODE(V_DATE_TYPE,'CreatedDate',WH.CREATE_TS,'LastModifiedDate',WH.UPDATE_TS) >= V_START_DATE
AND decode(v_date_type,'CreatedDate',wh.create_ts,'LastModifiedDate',wh.update_ts) <= v_end_date
and decode(wh.owner_uuid,null,-1,wh.owner_uuid)=decode(v_analyst_id,null,decode(wh.owner_uuid,null,-1,wh.owner_uuid),v_analyst_id)
) wh,
earnings_estimate.analyst_group ag,
earnings_estimate.coverage_status cs,
earnings_estimate.research_document rd,
( SELECT
BB.BB_ID ,
BRK.BROKER_ID,
BRK.BROKER_NM
FROM EARNINGS_ESTIMATE.BROKER BRK,COMMON.BB_ID BB
WHERE BRK.ORG_ID = BB.ORG_ID
AND BRK.ORG_LOC_REC_NUM = BB.ORG_LOC_REC_NUM
AND BRK.primary_broker_ind='Y') bro,
earnings_estimate.industry i
WHERE wh.analyst_group_id = ag.analyst_group_id
AND wh.workitem_status_id = ws.workitem_status_id
AND wh.workitem_type_id = wt.workitem_type_id
AND wh.coverage_status_num=cs.coverage_status_num
AND wh.workitem_outlier_type_id=wot.workitem_outlier_type_id
AND wh.PACKAGE_ID=rd.PACKAGE_ID(+)
AND rd.industry_id=i.industry_id(+)
AND rd.BROKER_BB_ID=bro.BB_ID(+)
ORDER BY wh.create_ts)tab1 )
;
I agree that the problem is most likely related to SELECT column_value from table(v_tickerlist).
By default, Oracle estimates that table functions return 8168 rows. Since you're testing the query with a single value, I assume that the actual number of values is usually much smaller. Cardinality estimates, like any forecast, are always wrong. But they should at least be in the ballpark of the actual cardinality for the optimizer to do its job properly.
You can force Oracle to always check the size with dynamic sampling. This will require more time to generate the plan, but it will probably be worth it in this case.
For example:
SQL> --Sample type
SQL> create or replace type v_tickerlist is table of number;
2 /
Type created.
SQL> --Show explain plans
SQL> set autotrace traceonly explain;
SQL> --Default estimate is poor. 8168 estimated, versus 3 actual.
SQL> SELECT column_value from table(v_tickerlist(1,2,3));
Execution Plan
----------------------------------------------------------
Plan hash value: 1748000095
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 16 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 8168 | 16336 | 16 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
SQL> --Estimate is perfect when dynamic sampling is used.
SQL> SELECT /*+ dynamic_sampling(tickerlist, 2) */ column_value
2 from table(v_tickerlist(1,2,3)) tickerlist;
Execution Plan
----------------------------------------------------------
Plan hash value: 1748000095
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 6 | 6 (0)| 00:00:01 |
| 1 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 3 | 6 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
SQL>
If that doesn't help, look at your explain plan (and post it here). Find where the cardinality estimate is most wrong, then try to figure out why that is.
Your query is too big and will take time when executed on bulk data. Try creating a few denormalised temp tables, extract the data into them, and then join between the temp tables. That will improve performance.
With this standalone query, do not pass any variable inside the subqueries, as in the line below:
WHERE ( wo.bb_ticker_id IN (SELECT
column_value from table(v_tickerlist)
Also, the outer joins will hurt performance. Better to implement the denormalised temp tables.
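A sketch of the temp-table suggestion using a global temporary table (the table and column names here are illustrative, not taken from the original query):

```sql
-- One-off DDL: a session-private staging table for the ticker list
CREATE GLOBAL TEMPORARY TABLE tmp_tickers (
    bb_ticker_id NUMBER PRIMARY KEY
) ON COMMIT DELETE ROWS;

-- Per execution: materialise the collection once...
INSERT INTO tmp_tickers (bb_ticker_id)
    SELECT column_value FROM TABLE(v_tickerlist);

-- ...then join to the temp table instead of the collection:
--   WHERE wo.bb_ticker_id IN (SELECT bb_ticker_id FROM tmp_tickers)
```

A real table like this has ordinary statistics and a usable index, which gives the optimizer a better basis for its cardinality estimates than a table function.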
Is there a way to find out whether a particular Oracle index was ever used by Oracle when executing a query?
We have a function-based index which I suspect is not being used by Oracle, and hence some queries are running slowly. How can I find out if any query run against the database is using this index?
If the question is "are there any queries that ever use the index?", then:
ALTER INDEX myindex MONITORING USAGE;
Wait a few days/months/years, then check:
SELECT *
FROM v$object_usage
WHERE index_name = 'MYINDEX';
http://docs.oracle.com/cd/B28359_01/server.111/b28310/indexes004.htm#i1006905
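Monitoring only tells you whether the index was used, not by what. To see which cached statements reference it, you can also search the plans currently in the shared pool (substitute your schema and index name):

```sql
SELECT DISTINCT p.sql_id, s.sql_text
  FROM v$sql_plan p
  JOIN v$sql      s ON s.sql_id = p.sql_id
 WHERE p.object_owner = 'MYSCHEMA'
   AND p.object_name  = 'MYINDEX';
```

This only covers statements still cached in the library cache, so it is a point-in-time check rather than a complete history.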
If you're using some sort of IDE (e.g. Oracle's SQL Developer, PL/SQL Developer from Allround Automations, Toad, etc) each one of them has some way to dump the plan for a statement - poke around in the menus and the on-line help.
If you can get into SQL*Plus (try typing "sql" at your friendly command line) you can turn autotrace on, execute your statement, and the plan should be printed. As in
SQL> set autotrace on
SQL> select * from dept where deptno = 40;
DEPTNO DNAME LOC
---------- -------------- -------------
40 OPERATIONS BOSTON
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=18)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (Cost=1 Card=1 Bytes=18)
2 1 INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
499 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
This assumes that your friendly neighborhood DBA has performed the necessary incantations to enable this feature. If this hasn't been done, or you just want One More Way (tm) to do this, try something like the following, substituting the query you care about:
SQL> EXPLAIN PLAN FOR select * from dept where deptno = 40;
Explained.
SQL> set linesize 132
SQL> SELECT * FROM TABLE( dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 2852011669
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DEPTNO"=40)
14 rows selected.
Share and enjoy.