Oracle - Understanding the no_index hint - oracle

I'm trying to understand how no_index actually speeds up a query and haven't been able to find documentation online to explain it.
For example I have this query that ran extremely slow
select *
from <tablename>
where field1_ like '%someGenericString%' and
field1_ <> 'someSpecificString' and
Action_='_someAction_' and
Timestamp_ >= trunc(sysdate - 2)
And one of our DBAs was able to speed it up significantly by doing this
select /*+ NO_INDEX(TAB_000000000019) */ *
from <tablename>
where field1_ like '%someGenericString%' and
field1_ <> 'someSpecificString' and
Action_='_someAction_' and
Timestamp_ >= trunc(sysdate - 2)
And I can't figure out why? I would like to figure out why this works so I can see if I can apply it to another query (this one a join) to speed it up because it's taking even longer to run.
Thanks!
** Update **
Here's what I know about the table in the example.
It's a 'partitioned table'
TAB_000000000019 is the table not a column in it
field1 is indexed

Oracle's optimizer makes judgements on how best to run a query, and to do this it uses a large number of statistics gathered about the tables and indexes. Based on these stats, it decides whether or not to use an index, or to just do a table scan, for example.
Critically, these stats are not automatically up-to-date, because they can be very expensive to gather. In cases where the stats are not up to date, the optimizer can make the "wrong" decision, and perhaps use an index when it would actually be faster to do a table scan.
If this is known by the DBA/developer, they can give hints (which is what NO_INDEX is) to the optimizer, telling it not to use a given index because it's known to slow things down, often due to out-of-date stats.
In your example, TAB_000000000019 will refer to an index or a table (I'm guessing an index, since it looks like an auto-generated name).
It's a bit of a black art, to be honest, but that's the gist of it, as I understand things.
Disclaimer: I'm not a DBA, but I've dabbled in that area.

Per your update: If field1 is the only indexed field, then the original query was likely doing a fast full scan on that index (i.e. reading through every entry in the index and checking against the filter conditions on field1), then using those results to find the rows in the table and filter on the other conditions. The conditions on field1 are such that an index unique scan or range scan (i.e. looking up specific values or ranges of values in the index) would not be possible.
Likely the optimizer chose this path because there are two filter predicates on field1. The optimizer would calculate estimated selectivity for each of these and then multiply them to determine their combined selectivity. But in many cases this will significantly underestimate the number of rows that will match the condition.
The NO_INDEX hint eliminates this option from the optimizer's consideration, so it essentially goes with the plan it thinks is next best -- possibly in this case using partition elimination based on one of the other filter conditions in the query.

Using an index degrades query performance if it results in more disk IO compared to querying the table with an index.
This can be demonstrated with a simple table:
create table tq84_ix_test (
a number(15) primary key,
b varchar2(20),
c number(1)
);
The following block fills 1 Million records into this table. Every 250th record is filled with a rare value in column b while all the others are filled with frequent value:
declare
rows_inserted number := 0;
begin
while rows_inserted < 1000000 loop
if mod(rows_inserted, 250) = 0 then
insert into tq84_ix_test values (
-1 * rows_inserted,
'rare value',
1);
rows_inserted := rows_inserted + 1;
else
begin
insert into tq84_ix_test values (
trunc(dbms_random.value(1, 1e15)),
'frequent value',
trunc(dbms_random.value(0,2))
);
rows_inserted := rows_inserted + 1;
exception when dup_val_on_index then
null;
end;
end if;
end loop;
end;
/
An index is put on the column
create index tq84_index on tq84_ix_test (b);
The same query, but once with index and once without index, differ in performance. Check it out for yourself:
set timing on
select /*+ no_index(tq84_ix_test) */
sum(c)
from
tq84_ix_test
where
b = 'frequent value';
select /*+ index(tq84_ix_test tq84_index) */
sum(c)
from
tq84_ix_test
where
b = 'frequent value';
Why is it? In the case without the index, all database blocks are read, in sequential order. Usually, this is costly and therefore considered bad. In normal situation, with an index, such a "full table scan" can be reduced to reading say 2 to 5 index database blocks plus reading the one database block that contains the record that the index points to. With the example here, it is different altogether: the entire index is read and for (almost) each entry in the index, a database block is read, too. So, not only is the entire table read, but also the index. Note, that this behaviour would differ if c were also in the index because in that case Oracle could choose to get the value of c from the index instead of going the detour to the table.
So, to generalize the issue: if the index does not pick few records then it might be beneficial to not use it.

Something to note about indexes is that they are precomputed values based on the row order and the data in the field. In this specific case you say that field1 is indexed and you are using it in the query as follows:
where field1_ like '%someGenericString%' and
field1_ <> 'someSpecificString'
In the query snippet above the filter is on both a variable piece of data since the percent (%) character cradles the string and then on another specific string. This means that the default Oracle optimization that doesn't use an optimizer hint will try to find the string inside the indexed field first and also find if the data it is a sub-string of the data in the field, then it will check that the data doesn't match another specific string. After the index is checked the other columns are then checked. This is a very slow process if repeated.
The NO_INDEX hint proposed by the DBA removes the optimizer's preference to use an index and will likely allow the optimizer to choose the faster comparisons first and not necessarily force index comparison first and then compare other columns.
The following is slow because it compares the string and its sub-strings:
field1_ like '%someGenericString%'
While the following is faster because it is specific:
field1_ like 'someSpecificString'
So the reason to use the NO_INDEX hint is if you have comparisons on the index that slow things down. If the index field is compared against more specific data then the index comparison is usually faster.
I say usually because when the indexed field contains more redundant data like in the example #Atish mentions above, it will have to go through a long list of comparison negatives before a positive comparison is returned. Hints produce varying results because both the database design and the data in the tables affect how fast a query performs. So in order to apply hints you need to know if the individual comparisons you hint to the optimizer will be faster on your data set. There are no shortcuts in this process. Applying hints should happen after proper SQL queries have been written because hints should be based on the real data.
Check out this hints reference: http://docs.oracle.com/cd/B19306_01/server.102/b14211/hintsref.htm

To add to what Rene' and Dave have said, this is what I have actually observed in a production situation:
If the condition(s) on the indexed field returns too many matches, Oracle is better off doing a Full Table Scan.
We had a report program querying a very large indexed table - the index was on a region code and the query specified the exact region code, so Oracle CBO uses the index.
Unfortunately, one specific region code accounted for 90% of the tables entries.
As long as the report was run for one of the other (minor) region codes, it completed in less than 30 minutes, but for the major region code it took many hours.
Adding a hint to the SQL to force a full table scan solved the problem.
Hope this helps.

I had read somewhere that using a % in front of query like '%someGenericString%' will lead to Oracle ignoring the INDEX on that field. Maybe that explains why the query is running slow.

Related

after alter system flush shared_pool low performance Oracle

We did refactoring and replaced 2 similar requests with parameterized request
a.isGood = :1
after that request that used this parameter with parameter 'Y' was executed longer that usually (become almost the same with parameter 'N'). We used alter system flush shared_pool command and request for parameter 'Y' has completed fast (as before refactoring) while request with parameter 'N' hangs for a long time.
As you could understand number of lines in data base with parameter 'N' much more then with 'Y'
Oracle 10g
Why it happened?
I assume that you have an index on that column, otherwise the performance would be the same regardless of the Y/N combination. I have seen this happening quite bit on 10g+ due to Oracle's optimizer Bind Peeking combined to histograms on columns with skewed data distribution. The histograms get created automatically when one gathers tables statistics using the parameter method_opt with 'FOR ALL COLUMNS SIZE AUTO' (among other values). Oracle optimizes the query for the value in the bind variables provided in the very first execution of that query. If you run the query with Y the first time, Oracle might want to use an index instead of a full table scan, since Y will return a small quantity of rows. The next time you run the query with N, then Oracle will repeat the first execution plan, which happens to be a poor choice for N, since it will return the vast majority of rows.
The execution plans are cached in the SGA. Once you flush it, you get a brand new execution plan the very first time the query runs again.
My suggestion is:
Obtain the explain plan of both original queries (one with a hardcoded Y and one with a hardcode N). Investigate if the two plans use different indexes or one has a much higher Cost than the other. I have the feeling that one uses a full table scan and the other uses an index. The first one should be faster for N and the second should be faster for Y.
Try to remove the statistics on the table and see if it makes a difference on the query that has the bind variable. Later you need to gather statistics again for the table or other queries on that table might suffer.
You can also gather statistics for that one table using method_opt => FOR ALL COLUMNS SIZE 1. That will keep the statistics without the histograms on any columns of that table.
A bitmap index on this column might fix the issue as well. Indexes on a column that have only two possible values (Y and N) are not exactly very efficient.
If column isGood has 99,000 'N' values and 1,000 'Y' values and you run with the condition isGood = 'Y', then it may be appropriate to use an index to find the results: you are returning 1% of the rows. If you run the query with the condition isGood = 'N', a full table scan would be more appropriate since you are returning most of the table anyway. If you were to use an index for the N condition, you would be doing an extra index lookup for every data item lookup.
Although the general rule is that bind parameters are good, it can be problematic in this kind of instance if really two different plans are required for the query. With the bind parameter scenario:
SELECT * FROM x WHERE isGood = :1
The statement will be parsed and a plan computed and saved in the sql cache. The same plan will be used for both query scenarios which is not desirable. But:
SELECT * FROM x WHERE isGood = 'Y'
SELECT * FROM x WHERE isGood = 'N'
will result in two plans being stored in the sql cache, hopefully each with the appropriate plan for the query. Version 11g avoids this problem with adaptive cursor sharing, which can use different plans for different bind variable values.
You need to look at your plans (EXPLAIN PLAN) to see what is happening in your case. Flush the cache, try one method, examine the plan; try the other, examine the plan. It might give you an idea what is happening in your case. There are a bunch of other topics you might follow up on that may help, for example:
using a hint to force the use of an index
cursor_sharing parameter
histograms on statistics

different columns on select query results different costs

There is an index at table invt_item_d on (item_id & branch_id & co_id) columns.
The plan results for the first query are TABLE ACCESS FULL and cost is 528,
results for the second query are INDEX FAST FULL SCAN (my index) and cost is 27.
The only difference is, as you can see, the selected column is used in index on the second query.
Is there something wrong with this? And please, can you tell me what should I do to fix this at db administration level?
select d.qty
from invt_item_d d
where d.item_id = 999
and d.branch_id = 888
and d.co_id = 777
select d.item_id
from invt_item_d d
where d.item_id = 999
and d.branch_id = 888
and d.co_id = 777
EDIT:
i made a new query and this query's cost is 529, with TABLE ACCESS FULL.
select qty from invt_item_d
so it doesn't matter if i use an index or not. Some says this is normal, is this a normal behaviour really?
In the first case, the table must be accessed, since the "qty" column is only stored in the table.
In the second case, all the columns used in the query can be read from the index, skipping the table read altogether.
You can add another index on columns (item_id, branch_id, co_id, qty) and it will most probably be used in the first query.
From the Oracle documentation: http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm
A fast full index scan is a full index scan in which the database
accesses the data in the index itself without accessing the table, and
the database reads the index blocks in no particular order.
Fast full index scans are an alternative to a full table scan when
both of the following conditions are met:
The index must contain all columns needed for the query.
A row containing all nulls must not appear in the query result set. For this result to be guaranteed, at least one column in the
index must have either:
A NOT NULL constraint
A predicate applied to it that prevents nulls from being considered in the query result set
This is exactly the main purpose of using index -- make search faster.
Querying columns with indexes are faster compared to querying columns without indexes.
Its basic oracle knowledge.
I am adding another answer because it seems to be more convinient.
First:
" i doesn't hit the index because there are 34000 rows, not millions". This is COMPLETELY WRONG and a dangerous understanding.
What I meant was, if there are a few thousand rows, and the index is not hit(oracle engine does a full table scan(TABLE ACCESS FULL) then), its not a big deal. Oracle is fast enough to read few thousand rows in a matter of a second(even without indexes) , and hence you wont feel the difference.The query is still slower(than the occasion when there is an index) , but its is so minimally slower that you wont feel the difference.
But, if there are millions of rows, the execution of the query will be much, much slower without index ( as this time it will scan millions of rows in a full table scan)and your performance will be hit.
Second: Why on earth do you have to loop over a table with 34000 rows, that too 4000 times???
Thats a terrible approach. Avoid loops as much as possible.There has to be a better approach!
Third:
You can force the oracle optimiser to hit the index by using the index hint.You will need to know the name of the index for that.
select /*+ index(invt_item_d <index_name>) */
d.qty
from invt_item_d d
where d.item_id = 999
and d.branch_id = 888
and d.co_id = 777
Here is the link to a stack overflow question on index hint

Is it OK to create several indexes on a table if they're really needed

I have a table with 7 columns.
It's going to contain lots and lots of data - something like more than 1.7 million records will be added every month.
Of those 7 columns 5 are the ones that I'll be using in the WHERE clause of my queries against this table in different combinations.
Is it OK to create different indexes for those possible combinations ?
I'm asking this question because if I do that, there'll be more than 10 indexes on this table and I'm not sure if this is a good idea.
On the other hand, I'm afraid of querying a table with this big amount of data without indexes.
Here's the table:
CREATE TABLE AG_PAYMENTS_TO_BE
(
PAYMENTID NUMBER(15, 0) NOT NULL
, DEPARTID NUMBER(3,0)
, PENSIONERID NUMBER(11, 0) NOT NULL
, AMOUNT NUMBER(6, 2)
, PERIOD CHAR(6 CHAR)
, PAYMENTTYPE NUMBER(1,0)
, ST NUMBER(1, 0) DEFAULT 0
, CONSTRAINT AG_PAYMENTS_TO_BE_PK PRIMARY KEY
(
PAYMENTID
)
ENABLE
);
Possible queries:
SELECT AMOUNT FROM AG_PAYMENTS_TO_BE WHERE ST=0 AND DEPARTID=112 AND PERIOD='201207';
SELECT AMOUNT FROM AG_PAYMENTS_TO_BE WHERE ST=0 AND PENSIONERID=123456 AND PERIOD='201207';
SELECT AMOUNT FROM AG_PAYMENTS_TO_BE WHERE ST=0 AND PENSIONERID=123456 AND PERIOD='201207' AND PAYMENTTYPE=1;
SELECT AMOUNT FROM AG_PAYMENTS_TO_BE WHERE ST=0 AND DEPARTID=112 AND ST=0;
SELECT AMOUNT FROM AG_PAYMENTS_TO_BE WHERE ST=0 AND PENSIONERID=123456;
and so on.
Ignoring index skip scans* for the moment, in order for a query to use an index:
The leading index columns must be listed in the query
They must compared using exact joins (i.e. using =, not <,> or like)
For example, a table with a composite index on (a, b) could use the index in the following queries:
a = :b1 and b >= :b2
a = :b1
but not:
b = :b2
because column b is listed second in the index. * In some cases, it's possible for the index to be used in this case via an index skip scan. This is where the leading column in the index is skipped. There needs to be relatively few distinct values for the first column however, which doesn't happen often (in my experience).
Note that a "larger" index can be used by queries which only use some of the leading columns from it. So in the example above, an index on just a is redundant because the queries shown can use the index on a, b. An index on just b may be useful however.
The more indexes you add, the slower your inserts/updates/deletes will be, because the indexes have to be maintained at the same time as the table. Therefore you should aim to keep the number of indexes down, unless there's significant query benefits to adding a new one. This is something you'll have to measure in your environment to determine the exact cost/benefit.
Note that having multiple indexes with similar columns can lead to the wrong index being selected. So there is potential downside for selects when you have many similar indexes. There is also a slight overhead in parse times, as Oracle has more options to consider when selecting the execution plan.
Looking at your queries I believe you only need indexes on:
st, departid, period
st, pensionerid, period
You may wish to add amount at the end of these as well, so your queries can be fully answered from the index, saving you a table lookup. You may also need further indexes if these columns are foreign keys to other tables, to prevent locking issues.
This decision would greatly depend on expected number of distinct values in each column, and thus selectivity of each possible index.
Things I would consider while making decisions:
Obviously, PAYMENTTYPE & ST fields hold up to 10 19 distinct values each, which is pretty unselective if we keep in mind your expected volume of data (~400M rows), so they won't help you much.
However, they probably could become good candidates for list partitioning instead.
I would also think of switching PERIOD CHAR(6 CHAR) to DATE and making a composite range-list partition on period+st/paymenttype.
DEPARTID - If you have hundreds of departments, then it's probably an indexing candidate, but if only dozens - then probably a full scan would perform way faster.
PENSIONERID seems to be a high-selectivity field, so I would consider creating a separate index on it, and including it in a composite index on PERIOD+PENSIONERID (in that field order).
I think you should create a few combined indexes (like ('ST' and 'PERIOD') and ('ST' and 'PENSIONERID'). That will speed up most of your sample queries...

ORACLE db performance tuning

We are running into performance issue where I need some suggestions ( we are on Oracle 10g R2)
The situation is sth like this
1) It is a legacy system.
2) In some of the tables it holds data for the last 10 years ( means data was never deleted since the first version was rolled out). Now in most of the OLTP tables they are having around 30,000,000 - 40,000,000 rows.
3) Search operations on these tables is taking flat 5-6 minutes of time. ( a simple query like select count(0) from xxxxx where isActive=’Y’ takes around 6 minutes of time.) When we saw the explain plan we found that index scan is happening on isActive column.
4) We have suggested archive and purge of the old data which is not needed and team is working towards it. Even if we delete 5 years of data we are left with around 15,000,000 - 20,000,000 rows in the tables which itself is very huge, so we thought of having table portioning on these tables, but we found that the user can perform search of most of the columns of these tables from UI,so which will defeat the very purpose of table partitioning.
so what are the steps which need to be taken to improve this situation.
First of all: question why you are issuing the query select count(0) from xxxxx where isactive = 'Y' in the first place. Nine out of ten times it is a lazy way to check for existence of a record. If that's the case with you, just replace it with a query that select 1 row (rownum = 1 and a first_rows hint).
The number of rows you mention are nothing to be worried about. If your application doesn't perform well when number of rows grows, then your system is not designed to scale. I'd investigate all queries that take too long using a SQL*Trace or ASH and fix it.
By the way: nothing you mentioned justifies the term legacy, IMHO.
Regards,
Rob.
Just a few observations:
I'm guessing that the "isActive" column can have two values - 'Y' and 'N' (or perhaps 'Y', 'N', and NULL - although why in the name of Fred there wouldn't be a NOT NULL constraint on such a column escapes me). If this is the case an index on this column would have very poor selectivity and you might be better off without it. Try dropping the index and re-running your query.
#RobVanWijk's comment about use of SELECT COUNT(*) is excellent. ONLY ask for a row count if you really need to have the count; if you don't need the count, I've found it's faster to do a direct probe (SELECT whatever FROM wherever WHERE somefield = somevalue) with an apprpriate exception handler than it is to do a SELECT COUNT(*). In the case you cited, I think it would be better to do something like
BEGIN
SELECT IS_ACTIVE
INTO strIsActive
FROM MY_TABLE
WHERE IS_ACTIVE = 'Y';
bActive_records_found := TRUE;
EXCEPTION
WHEN NO_DATA_FOUND THEN
bActive_records_found := FALSE;
WHEN TOO_MANY_ROWS THEN
bActive_records_found := TRUE;
END;
As to partitioning - partitioning can be effective at reducing query times IF the field on which the table is partitioned is used in all queries. For example, if a table is partitioned on the TRANSACTION_DATE variable, then for the partitioning to make a difference all queries against this table would have to have a TRANSACTION_DATE test in the WHERE clause. Otherwise the database will have to search each partition to satisfy the query, so I doubt any improvements would be noted.
Share and enjoy.

Oracle 8i date function slow

I'm trying to run the following PL/SQL on an Oracle 8i server (old, I know):
select
-- stuff --
from
s_doc_quote d,
s_quote_item i,
s_contact c,
s_addr_per a,
cx_meter_info m
where
d.row_id = i.sd_id
and d.con_per_id = c.row_id
and i.ship_per_addr_id = a.row_id(+)
and i.x_meter_info_id = m.row_id(+)
and d.x_move_type in ('Move In','Move Out','Move Out / Move In')
and i.prod_id in ('1-QH6','1-QH8')
and d.created between add_months(trunc(sysdate,'MM'), -1) and sysdate
;
Execution is incredibly slow however. Because the server is taken down around midnight each night, it often fails to complete in time.
The execution plan is as follows:
SELECT STATEMENT 1179377
NESTED LOOPS 1179377
NESTED LOOPS OUTER 959695
NESTED LOOPS OUTER 740014
NESTED LOOPS 520332
INLIST ITERATOR
TABLE ACCESS BY INDEX ROWID S_QUOTE_ITEM 157132
INDEX RANGE SCAN S_QUOTE_ITEM_IDX8 8917
TABLE ACCESS BY INDEX ROWID S_DOC_QUOTE 1
INDEX UNIQUE SCAN S_DOC_QUOTE_P1 1
TABLE ACCESS BY INDEX ROWID S_ADDR_PER 1
INDEX UNIQUE SCAN S_ADDR_PER_P1 1
TABLE ACCESS BY INDEX ROWID CX_METER_INFO 1
INDEX UNIQUE SCAN CX_METER_INFO_P1 1
TABLE ACCESS BY INDEX ROWID S_CONTACT 1
INDEX UNIQUE SCAN S_CONTACT_P1 1
If I change the following where clause however:
and d.created between add_months(trunc(sysdate,'MM'), -1) and sysdate
To a static value, such as:
and d.created between to_date('20110101','yyyymmdd') and sysdate
the execution plan becomes:
SELECT STATEMENT 5
NESTED LOOPS 5
NESTED LOOPS OUTER 4
NESTED LOOPS OUTER 3
NESTED LOOPS 2
TABLE ACCESS BY INDEX ROWID S_DOC_QUOTE 1
INDEX RANGE SCAN S_DOC_QUOTE_IDX1 3
INLIST ITERATOR
TABLE ACCESS BY INDEX ROWID S_QUOTE_ITEM 1
INDEX RANGE SCAN S_QUOTE_ITEM_IDX4 4
TABLE ACCESS BY INDEX ROWID S_ADDR_PER 1
INDEX UNIQUE SCAN S_ADDR_PER_P1 1
TABLE ACCESS BY INDEX ROWID CX_METER_INFO 1
INDEX UNIQUE SCAN CX_METER_INFO_P1 1
TABLE ACCESS BY INDEX ROWID S_CONTACT 1
INDEX UNIQUE SCAN S_CONTACT_P1 1
which begins to return rows almost instantly.
So far, I've tried replacing the dynamic date condition with bind variables, as well as using a subquery which selects a dynamic date from the dual table. Neither of these methods have helped improve performance so far.
Because I'm relatively new to PL/SQL, I'm unable to understand the reasons for such substantial differences in the execution plans.
I'm also trying to run the query as a pass-through from SAS, but for the purposes of testing the execution speed I've been using SQL*Plus.
EDIT:
For clarification, I've already tried using bind variables as follows:
var start_date varchar2(8);
exec :start_date := to_char(add_months(trunc(sysdate,'MM'), -1),'yyyymmdd')
With the following where clause:
and d.created between to_date(:start_date,'yyyymmdd') and sysdate
which returns an execution cost of 1179377.
I would also like to avoid bind variables if possible as I don't believe I can reference them from a SAS pass-through query (although I may be wrong).
I doubt that the problem here has much to do with the execution time of the ADD_MONTHS function. You've already shown that there is a significant difference in the execution plan when you use a hardcoded minimum date. Big changes in execution plans generally have much more impact on run time than function call overhead is likely to, although potentially different execution plans can mean that the function is called many more times. Either way the root problem to look at is why you aren't getting the execution plan you want.
The good execution plan starts off with a range scan on S_DOC_QUOTE_IDX1. Given the nature of the change to the query, I assume this is an index on the CREATED column. Often the optimizer will not choose to use an index on a date column when the filter condition is based on SYSDATE. Because it is not evaluated until execution time, after the execution plan has been determined, the parser cannot make a good estimate of the selectivity of the date filter condition. When you use a hardcoded start date instead, the parser can use that information to determine selectivity, and makes a better choice about the use of the index.
I would have suggested bind variables as well, but I think because you are on 8i the optimizer can't peek at bind values, so this leaves it just as much in the dark as before. On a later Oracle version I would expect that the bind solution would be effective.
However, this is a good case where using literal substitution is probably more appropriate than using a bind variable, since (a) the start date value is not user-specified, and (b) it will remain constant for the whole month, so you won't be parsing lots of slightly different queries.
So my suggestion is to write some code to determine a static value for the start date and concatenate it directly into the query string before parsing & execution.
First of all, the reason you are getting different execution time is not because Oracle executes the date function a lot. The execution of this SQL function, even if it is done for each and every row (it probably is not by the way), only takes a negligible amount of time compared to the time it takes to actually retrieve the rows from disk/memory.
You are getting completely different execution times because, as you have noticed, Oracle chooses a different access path. Choosing one access path over another can lead to orders of magnitudes of difference in execution time. The real question therefore, is not "why does add_months takes time?" but:
Why does Oracle choose this particular unefficient path while there is a more efficient one?
To answer this question, one must understand how the optimizer works. The optimizer chooses a particular access path by estimating the cost of several access paths (all of them if there are only a few tables) and choosing the execution plan that is expected to be the most efficient. The algorithm to determine the cost of an execution plan has rules and it makes its estimation based on statistics gathered from your data.
As all estimation algorithms, it makes assumptions about your data, such as the general distribution based on min/max value of columns, cardinality, and the physical distribution of the values in the segment (clustering factor).
How this applies to your particular query
In your case, the optimizer has to make an estimation about the selectivity of the different filter clauses. In the first query the filter is between two variables (add_months(trunc(sysdate,'MM'), -1) and sysdate) while in the other case the filter is between a constant and a variable.
They look the same to you because you have substituted the variable by its value, but to the optimizer the cases are very different: the optimizer (at least in 8i) only computes an execution plan once for a particular query. Once the access path has been determined, all further execution will get the same execution plan. It can not, therefore, replace a variable by its value because the value may change in the future, and the access plan must work for all possible values.
Since the second query uses variables, the optimizer cannot determine precisely the selectivity of the first query, so the optimizer makes a guess, and that results in your case in a bad plan.
What can you do when the optimizer doesn't choose the correct plan
As mentionned above, the optimizer sometimes makes bad guesses, which result in suboptimal access path. Even if it happens rarely, this can be disastrous (hours instead of seconds). Here are some actions you could try:
Make sure your stats are up-to-date. The last_analyzed column on ALL_TABLES and ALL_INDEXES will tell you when was the last time the stats were collected on these objects. Good reliable stats lead to more accurate guesses, leading (hopefully) to better execution plan.
Learn about the different options to collect statistics (dbms_stats package)
Rewrite your query to make use of constants, when it makes sense, so that the optimizer will make more reliable guesses.
Sometimes two logically identical queries will result in different execution plans, because the optimizer will not compute the same access paths (of all possible paths).
There are some tricks you can use to force the optimizer to perform some join before others, for example:
Use rownum to materialize a subquery (it may take more temporary space, but will allow you to force the optimizer through a specific step).
Use hints, although most of the time I would only turn to hints when all else fails. In particular, I sometimes use the LEADING hint to force the optimizer to start with a specific table (or couple of table).
Last of all, you will probably find that the more recent releases have a generally more reliable optimizer. 8i is 12+ years old and it may be time for an upgrade :)
This is really an interesting topic. The oracle optimizer is ever-changing (between releases) it improves over time, even if new quirks are sometimes introduced as defects get corrected. If you want to learn more, I would suggest Jonathan Lewis' Cost Based Oracle: Fundamentals
That's because the function is run for every comparison.
sometimes it's faster to put it in a select from dual:
and d.created
between (select add_months(trunc(sysdate,'MM'), -1) from dual)
and sysdate
otherwise, you could also join the date like this:
select
-- stuff --
from
s_doc_quote d,
s_quote_item i,
s_contact c,
s_addr_per a,
cx_meter_info m,
(select add_months(trunc(sysdate,'MM'), -1) as startdate from dual) sd
where
d.row_id = i.sd_id
and d.con_per_id = c.row_id
and i.ship_per_addr_id = a.row_id(+)
and i.x_meter_info_id = m.row_id(+)
and d.x_move_type in ('Move In','Move Out','Move Out / Move In')
and i.prod_id in ('1-QH6','1-QH8')
and d.created between sd.startdate and sysdate
Last option and actually the best chance of improved performance: Add a date parameter to the query like this:
and d.created between :startdate and sysdate
[edit]
I'm sorry, I see you already tried options like these. Still odd. If the constant value works, the bind parameter should work as well, as long as you keep the add_months function outside the query.
This is SQL. You may want to use PL/SQL and save the calculation add_months(trunc(sysdate,'MM'), -1) into a variable first ,then bind that.
Also, I've seen SAS calcs take a long while due to pulling data across the network and doing additional work on each row it processes. Depending on your environment, you may consider creating a temp table to store the results of these joins first, then hitting the temp table (try a CTAS).

Resources