ORDER BY subquery and ROWNUM goes against relational philosophy? - oracle

Oracle 's ROWNUM is applied before ORDER BY. In order to put ROWNUM according to a sorted column, the following subquery is proposed in all documentations and texts.
select *
from (
select *
from table
order by price
)
where rownum <= 7
That bugs me. As I understand, table input into FROM is relational, hence no order is stored, meaning the order in the subquery is not respected when seen by FROM.
I cannot remember the exact scenarios but this fact of "ORDER BY has no effect in the outer query" I have read more than once. Examples are in-line subqueries, subquery for INSERT, ORDER BY of PARTITION clause, etc. For example in
OVER (PARTITION BY name ORDER BY salary)
the salary order will not be respected in outer query, and if we want salary to be sorted at outer query output, another ORDER BY need to be added in the outer query.
Some insights from everyone on why the relational property is not respected here and order is stored in the subquery ?

The ORDER BY in this context is in effect Oracle's proprietary syntax for generating an "ordered" row number on a (logically) unordered set of rows. This is a poorly designed feature in my opinion but the equivalent ISO standard SQL ROW_NUMBER() function (also valid in Oracle) may make it clearer what is happening:
select *
from (
select ROW_NUMBER() OVER (ORDER BY price) rn, *
from table
) t
where rn <= 7;
In this example the ORDER BY goes where it more logically belongs: as part of the specification of a derived row number attribute. This is more powerful than Oracle's version because you can specify several different orderings defining different row numbers in the same result. The actual ordering of rows returned by this query is undefined. I believe that's also true in your Oracle-specific version of the query because no guarantee of ordering is made when you use ORDER BY in that way.
It's worth remembering that Oracle is not a Relational DBMS. In common with other SQL DBMSs Oracle departs from the relational model in some fundamental ways. Features like implicit ordering and DISTINCT exist in the product precisely because of the non-relational nature of the SQL model of data and the consequent need to work around keyless tables with duplicate rows.

Not surprisingly really, Oracle treats this as a bit of a special case. You can see that from the execution plan. With the naive (incorrect/indeterminate) version of the limit that crops up sometimes, you get SORT ORDER BY and COUNT STOPKEY operations:
select *
from my_table
where rownum <= 7
order by price;
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 13 | 3 (34)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | TABLE ACCESS FULL| MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM<=7)
If you just use an ordered subquery, with no limit, you only get the SORT ORDER BY operation:
select *
from (
select *
from my_table
order by price
);
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 13 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------
With the usual subquery/ROWNUM construct you get something different,
select *
from (
select *
from my_table
order by price
)
where rownum <= 7;
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 13 | 3 (34)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 13 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL | MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=7)
3 - filter(ROWNUM<=7)
The COUNT STOPKEY operation is still there for the outer query, but the inner query (inline view, or derived table) now has a SORT ORDER BY STOPKEY instead of the simple SORT ORDER BY. This is all hidden away in the internals so I'm speculating, but it looks like the stop key - i.e. the row number limit - is being pushed into the subquery processing, so in effect the subquery may only end up with seven rows anyway - though the plan's ROWS value doesn't reflect that (but then you get the same plan with a different limit), and it still feels the need to apply the COUNT STOPKEY operation separately.
Tom Kyte covered similar ground in an Oracle Magazine article, when talking about "Top- N Query Processing with ROWNUM" (emphasis added):
There are two ways to approach this:
- Have the client application run that query and fetch just the first N rows.
- Use that query as an inline view, and use ROWNUM to limit the results, as in SELECT * FROM ( your_query_here ) WHERE ROWNUM <= N.
The second approach is by far superior to the first, for two reasons. The lesser of the two reasons is that it requires less work by the client, because the database takes care of limiting the result set. The more important reason is the special processing the database can do to give you just the top N rows. Using the top- N query means that you have given the database extra information. You have told it, "I'm interested only in getting N rows; I'll never consider the rest." Now, that doesn't sound too earth-shattering until you think about sorting—how sorts work and what the server would need to do.
... and then goes on to outline what it's actually doing, rather more authoritatively than I can.
Interestingly I don't think the order of the final result set is actually guaranteed; it always seems to work, but arguably you should still have an ORDER BY on the outer query too to make it complete. It looks like the order isn't really stored in the subquery, it just happens to be produced like that. (I very much doubt that will ever change as it would break too many things; this ends up looking similar to a table collection expression which also always seems to retain its ordering - breaking that would stop dbms_xplan working though. I'm sure there are other examples.)
Just for comparison, this is what the ROW_NUMBER() equivalent does:
select *
from (
select ROW_NUMBER() OVER (ORDER BY price) rn, my_table.*
from my_table
) t
where rn <= 7;
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 4 (25)| 00:00:01 |
|* 1 | VIEW | | 2 | 52 | 4 (25)| 00:00:01 |
|* 2 | WINDOW SORT PUSHED RANK| | 2 | 26 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL | MY_TABLE | 2 | 26 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN"<=7)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "PRICE")<=7)

Adding to sqlvogel's good answer :
"As I understand, table input into FROM is relational"
No, table input into FROM is not relational. It is not relational because "table input" are tables and tables are not relations. The myriads of quirks and oddities in SQL eventually all boil down to that simple fact : the core building brick in SQL is the table, and a table is not a relation. To sum up the differences :
Tables can contain duplicate rows, relations cannot. (As a consequence, SQL offers bag algebra, not relational algebra. As another consequence, it is as good as impossible for SQL to even define equality comparison for its most basic building brick !!! How would you compare tables for equality given that you might have to deal with duplicate rows ?)
Tables can contain unnamed columns, relations cannot. SELECT X+Y FROM ... As a consequence, SQL is forced into "column identity by ordinal position", and as a consequence of that, you get all sorts of quirks, e.g. in SELECT A,B FROM ... UNION SELECT B,A FROM ...
Tables can contain duplicate column names, relations cannot. A.ID and B.ID in a table are not distinct column names. The part before the dot is not part of the name, it is a "scope identifier", and that scope identifier "disappears" once you're "outside the SELECT" it appears/is introduced in. You can verify this with a nested SELECT : SELECT A.ID FROM (SELECT A.ID, B.ID FROM ...). It won't work (unless your particular implementation departs from the standard in order to make it work).
Various SQL constructs leave people with the impression that tables do have an ordering to rows. The ORDER BY clause, obviously, but also the GROUP BY clause (which can be made to work only by introducing rather dodgy concepts of "intermediate tables with rows grouped together"). Relations simply are not like that.
Tables can contain NULLs, relations cannot. This one has been beaten to death.
There should be some more, but I don't remember them off the tip of the hat.

Related

Confusion regarding to_char and to_number

First of all, I am aware about basics.
select to_number('A231') from dual; --this will not work but
select to_char('123') from dual;-- this will work
select to_number('123') from dual;-- this will also work
Actually in my package, we have 2 tables A(X number) and B(Y varchar) There are many columns but we are worried about only X and Y. X contains values only numeric like 123,456 etc but Y contains some string and some number for eg '123','HR123','Hello'. We have to join these 2 tables. its legacy application so we are not able to change tables and columns.
Till this time below condition was working properly
to_char(A.x)=B.y;
But since there is index on Y, performance team suggested us to do
A.x=to_number(B.y); it is running in dev env.
My question is, in any circumstances will this query give error? if it picks '123' definitely it will give 123. but if it picks 'AB123' then it will fail. can it fail? can it pick 'AB123' even when it is getting joined with other table.
can it fail?
Yes. It must put every row through TO_NUMBER before it can check whether or not it meets the filter condition. Therefore, if you have any one row where it will fail then it will always fail.
From Oracle 12.2 (since you tagged Oracle 12) you can use:
SELECT *
FROM A
INNER JOIN B
ON (A.x = TO_NUMBER(B.y DEFAULT NULL ON CONVERSION ERROR))
Alternatively, put an index on TO_CHAR(A.x) and use your original query:
SELECT *
FROM A
INNER JOIN B
ON (TO_CHAR(A.x) = B.y)
Also note: Having an index on B.y does not mean that the index will be used. If you are filtering on TO_NUMBER(B.y) (with or without the default on conversion error) then you would need a function-based index on the function TO_NUMBER(B.Y) that you are using. You should profile the queries and check the explain plans to see whether there is any improvement or change in use of indexes.
Never convert a VARCHAR2 column that can contain non-mumeric strings to_number.
This can partially work, but will eventuelly definitively fail.
Small Example
create table a as
select rownum X from dual connect by level <= 10;
create table b as
select to_char(rownum) Y from dual connect by level <= 10
union all
select 'Hello' from dual;
This could work (as you limit the rows, so that the conversion works; if you are lucky and Oracle chooses the right execution plan; which is probable, but not guarantied;)
select *
from a
join b on A.x=to_number(B.y)
where B.y = '1';
But this will fail
select *
from a
join b on A.x=to_number(B.y)
ORA-01722: invalid number
Performance
But since there is index on Y, performance team suggested us to do A.x=to_number(B.y);
You should chalange the team, as if you use a function on a column (to_number(B.y)) index can't be used.
On the contrary, your original query can perfectly use the following indexes:
create index b_y on b(y);
create index a_x on a(x);
Query
select *
from a
join b on to_char(A.x)=B.y
where A.x = 1;
Execution Plan
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 1 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 5 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN| A_X | 1 | 3 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN| B_Y | 1 | 2 | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."X"=1)
3 - access("B"."Y"=TO_CHAR("A"."X"))

Oracle linguistic index not used when SQL contains parameter with LIKE

My schema (simplified):
CREATE TABLE LOC
(
LOC_ID NUMBER(15,0) NOT NULL,
LOC_REF_NO VARCHAR2(100 CHAR) NOT NULL
)
/
CREATE INDEX LOC_REF_NO_IDX ON LOC
(
NLSSORT("LOC_REF_NO",'nls_sort=''BINARY_AI''') ASC
)
/
My query (in SQL*Plus):
ALTER SESSION SET NLS_COMP=LINGUISTIC NLS_SORT=BINARY_AI
/
VAR LOC_REF_NO VARCHAR2(50)
BEGIN
:LOC_REF_NO := 'SPDJ1501270';
END;
/
-- Causes full table scan (i.e, does not use LOC_REF_NO_IDX)
SELECT * FROM LOC WHERE LOC_REF_NO LIKE :LOC_REF_NO||'%';
-- Causes index scan (i.e. uses LOC_REF_NO_IDX)
SELECT * FROM LOC WHERE LOC_REF_NO LIKE 'SPDJ1501270%';
That the index is not used has been confirmed by doing an AUTOTRACE (EXPLAIN PLAN) and the SQL just runs slower. Tried a number of thing without success. Anyone got any idea what is going on? I am using Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit.
Update 1:
Note that the index is used when I use an equals with a parameter:
SELECT * FROM LOC WHERE LOC_REF_NO = :LOC_REF_NO;
Explain Plan:
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 93 | 5 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| LOC | 1 | 93 | 5 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | LOC_REF_NO_IDX | 1 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NLSSORT("LOC_REF_NO",'nls_sort=''BINARY_AI''')=NLSSORT(:LOC_REF_NO,'nls_
sort=''BINARY_AI'''))
Whereas
SELECT * FROM LOC WHERE LOC_REF_NO LIKE :LOC_REF_NO||'%';
Explain Plan:
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50068 | 3471K| 5724 (1)| 00:01:09 |
|* 1 | TABLE ACCESS FULL| LOC | 50068 | 3471K| 5724 (1)| 00:01:09 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("LOC_REF_NO" LIKE :LOC_REF_NO||'%')
Dumbfounded!
Update 2:
The reason we are using NLSSORT on an index is to make Oracle queries case insensitive and this was the general recommendation. Previously we use functional indexes with NLS_UPPER. The strange thing that is that the index is always used, parameter or not, as shown below.
So if table is as above, LOC_REF_NO_IDX index removed and this one added:
CREATE INDEX LOC_REF_NO_CI_IDX ON LOC
(
NLS_UPPER(LOC_REF_NO) ASC
)
/
The all of the following use the index:
ALTER SESSION SET NLS_COMP=BINARY NLS_SORT=BINARY;
SELECT * FROM LOC WHERE NLS_UPPER(LOC_REF_NO) LIKE :LOC_REF_NO||'%';
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50068 | 5329K| 5700 (1)| 00:01:09 |
| 1 | TABLE ACCESS BY INDEX ROWID| LOC | 50068 | 5329K| 5700 (1)| 00:01:09 |
|* 2 | INDEX RANGE SCAN | LOC_REF_NO_CI_IDX | 9012 | | 43 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NLS_UPPER("LOC_REF_NO") LIKE :LOC_REF_NO||'%')
filter(NLS_UPPER("LOC_REF_NO") LIKE :LOC_REF_NO||'%')
So for some reason when using LIKE with a parameter on a linguistic index, the Oracle optimizer is deciding not to use the index.
According to Oracle support note 1451804.1 this is a known limitation of using LIKE with NLSSORT-based indexes.
If you look at the execution plan for your fixed-value query you see something like:
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NLSSORT("LOC_REF_NO",'nls_sort=''BINARY_AI''')>=HEXTORAW('7370646A313530
3132373000') AND NLSSORT("LOC_REF_NO",'nls_sort=''BINARY_AI''')<HEXTORAW('7370646A313
5303132373100') )
Those raw values evaluate to spdj1501270 and spdj1501271; those are derived from your constant string, and any values matching your like condition will be in that range. That parse-time transformation has to be based on a constant value, and doesn't work with a bind variable or an expression, presumably because it's evaluated too late.
See the note for more information, but there doesn't seem to be a workaround unfortunately. You might have to go back to your NLS_UPPER approach.
Previous explanation applies generally but not in this specific case, but kept for reference...
In general, with the fixed value the optimiser can estimate how selective your query is when it parses it, because it can know roughly what proportion of index values match that value. It may or may not use the index, depending on the actual value you use.
With the bind variable it comes up with a plan via bind variable peeking:
In bind variable peeking (also known as bind peeking), the optimizer looks at the value in a bind variable when the database performs a hard parse of a statement.
When a query uses literals, the optimizer can use the literal values to find the best plan. However, when a query uses bind variables, the optimizer must select the best plan without the presence of literals in the SQL text. This task can be extremely difficult. By peeking at bind values the optimizer can determine the selectivity of a WHERE clause condition as if literals had been used, thereby improving the plan.
It uses the statistics it has gathered to decide if any particular value is more likely than others. That probably isn't going to be the case here, especially with the like. It's falling back to a full table scan becuse it can't determine when it does the hard parse that the index will be more selective most of the time. Imagine, for example, that the parse decided to use the index, but then you supplied a bind value of just S, or even null - using the index would then do much more work than a full table scan.
Also worth noting:
When choosing a plan, the optimizer only peeks at the bind value during the hard parse. This plan may not be optimal for all possible values.
Adaptive cursor sharing can mitigate this, but this query may not qualify:
The criteria used by the optimizer to decide whether a cursor is bind-sensitive include the following:
The optimizer has peeked at the bind values to generate selectivity estimates.
A histogram exists on the column containing the bind value.
When I mocked this up with a small-ish amount of limited data, v$sql reported both is_bind_sensitive and is_bind_aware as 'N'.

Optimization: Use “between” or “>= and <=”

Let’s take the following question, for example. Is there any difference between using:
SELECT * FROM foo_table
WHERE column BETWEEN n and m
and
SELECT * FROM foo_table
WHER column>=n and column<=m?
Looks like a simple one,Oracle’s documentation is dead clear on this:
[Between] means “greater than or equal to low value and less than or equal to high value.”
They are the same from a semantic point of view. But, SQL is a declarative language. So, you wouldn’t expect same execution plan with two semantically identical statements, right?
My Questions are:
An optimizer might watch for a different code path for equal or less than versus less than on a numeric index, right?
What is better and faster way for Numeric types?
Oracle's optimizer converts between to >= and <=, which you can see from the execution plan. For example, this is from 11gR2:
explain plan for
select * from dual where dummy between 'W' and 'Y';
select * from table(dbms_xplan.display);
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DUMMY">='W' AND "DUMMY"<='Y')
Notice the filter being used. So it makes no difference which you use, except perhaps in edge cases like the article this seems to have come from (thanks Shankar). Worrying about it for mainstream cases probably isn't going to be on much benefit.

Generalising Oracle Static Cursors

I am creating an OLAP-like package in Oracle where you call a main, controlling function that assembles its returning output table by making numerous left joins. These joined tables are defined in 'slave' functions within the package, which return specific subsets using static cursors, parameterised by the function's arguments. The thing is, these cursors are all very similar.
Is there a way, beyond generating dynamic queries and using them in a ref cursor, that I can generalise these. Every time I add a function, I get this weird feeling, as a developer, that this isn't particularly elegant!
Pseduocode
somePackage
function go(param)
return select myRows.id,
stats1.value,
stats2.value
from myRows
left join table(somePackage.stats1(param)) stats1
on stats1.id = myRows.id
left join table(somePackage.stats2(param)) stats2
on stats2.id = myRows.id
function stats1(param)
return [RESULTS OF SOME QUERY]
function stats2(param)
return [RESULTS OF A RELATED QUERY]
The stats queries all have the same structure:
First they aggregate the data in a useful way
Then they split this data into logical sections, based on criteria, and aggregate again (e.g., by department, by region, etc.) then union the results
Then they return the results, cast into the relevant object type, so I can easily do a bulk collect
Something like:
cursor myCursor is
with fullData as (
[AGGREGATE DATA]
),
fullStats as (
[AGGREGATE FULLDATA BY TOWN]
union all
[AGGREGATE FULLDATA BY REGION]
union all
[AGGREGATE FULLDATA BY COUNTRY]
)
select myObjectType(fullStats.*)
from fullStats;
...
open myCursor;
fetch myCursor bulk collect into output limit 1000;
close myCursor;
return output;
Filter operations can help build dynamic queries with static SQL. Especially when the column list is static.
You may have already considered this approach but discarded it for performance reasons. "Why execute every SQL block if we only need the results
from one of them?" You're in luck, the optimizer already does this for you with a FILTER operation.
Example Query
First create a function that waits 5 seconds every time it is run. It will help find which query blocks were executed.
create or replace function slow_function return number is begin
dbms_lock.sleep(5);
return 1;
end;
/
This static query is controlled by bind variables. There are three query blocks but the entire query runs in 5 seconds instead of 15.
declare
v_sum number;
v_query1 number := 1;
v_query2 number := 0;
v_query3 number := 0;
begin
select sum(total)
into v_sum
from
(
select total from (select slow_function() total from dual) where v_query1 = 1
union all
select total from (select slow_function() total from dual) where v_query2 = 1
union all
select total from (select slow_function() total from dual) where v_query3 = 1
);
end;
/
Execution Plan
This performance is not the result of good luck; it's not simply Oracle randomly executing one predicate before another. Oracle analyzes the bind variables before
run-time and does not even execute the irrelevant query blocks. That's what the FILTER operation below is doing. (Which is a poor name, many people generally
refer to all predicates as "filters". But only some of them result in a FILTER operation.)
select * from table(dbms_xplan.display_cursor(sql_id => '0cfqc6a70kzmt'));
SQL_ID 0cfqc6a70kzmt, child number 0
-------------------------------------
SELECT SUM(TOTAL) FROM ( SELECT TOTAL FROM (SELECT SLOW_FUNCTION()
TOTAL FROM DUAL) WHERE :B1 = 1 UNION ALL SELECT TOTAL FROM (SELECT
SLOW_FUNCTION() TOTAL FROM DUAL) WHERE :B2 = 1 UNION ALL SELECT TOTAL
FROM (SELECT SLOW_FUNCTION() TOTAL FROM DUAL) WHERE :B3 = 1 )
Plan hash value: 926033116
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 6 (100)| |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | VIEW | | 3 | 39 | 6 (0)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
|* 4 | FILTER | | | | | |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 6 | FILTER | | | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 8 | FILTER | | | | | |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter(:B1=1)
6 - filter(:B2=1)
8 - filter(:B3=1)
Issues
The FILTER operation is poorly documented. I can't explain in detail when it does or does not work, and exactly what affects it has on other parts of the query. For example, in the explain plan the Rows estimate is 3, but at run time Oracle should easily be able to estimate the cardinality is 1. Apparently the execution plan is not that dynamic, that poor cardinality estimate may cause later issues. Also, I've seen some weird cases where static expressions are not appropriately filtered. But if a query uses a simple equality predicate it should be fine.
This approach allows you remove all dynamic SQL and replace it with a large static SQL statement. Which has some advantages; dynamic SQL is often "ugly" and difficult to debug. But people only familiar with procedural programming tend to think of a single SQL statement as one huge God-function, a bad practice. They won't appreciate that the UNION ALLs create independent blocks of SQL
Dynamic SQL is Still Probably Better
In general I would recommend against this approach. What you have is good because it looks good. The main problem with dynamic SQL is that people don't treat it like real code; it's not commented or formatted and ends up looking like a horrible mess that nobody can understand. If you are able to spend the extra time to generate clean code then you should stick with that.

Adding an Index degraded execution time

I have a table like this:
myTable (id, group_id, run_date, table2_id, description)
I also have a index like this:
index myTable_grp_i on myTable (group_id)
I used to run a query like this:
select * from myTable t where t.group_id=3 and t.run_date='20120512';
and it worked fine and everyone was happy.
Until I added another index:
index myTable_tab2_i on myTable (table2_id)
My life became miserable... it's taking almost as 5 times longer to run !!!
execution plan looks the same (with or without the new index):
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 220 | 17019
|* 1 | TABLE ACCESS BY INDEX ROWID| MYTABLE | 1 | 220 | 17019
|* 2 | INDEX RANGE SCAN | MYTABLE_GRP_I | 17056 | | 61
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."RUN_DATE"='20120512')
2 - access("T"."GROUP_ID"=3)
I have almost no hair left on my head, why should another index which is not used, on a column which is not in the where clause make a difference ...
I will update the things I checked:
a. I removed the new index and it run faster
b. I added the new index in 2 more different environments and the same thing happen
c. I changed MYTABLE_GRP_I to be on columns run_date and group_id - this made it run fast as a lightning !!
But still why does it happen ?

Resources