Confusion regarding to_char and to_number - oracle

First of all, I am aware about basics.
select to_number('A231') from dual; --this will not work but
select to_char('123') from dual;-- this will work
select to_number('123') from dual;-- this will also work
Actually in my package, we have 2 tables A(X number) and B(Y varchar) There are many columns but we are worried about only X and Y. X contains values only numeric like 123,456 etc but Y contains some string and some number for eg '123','HR123','Hello'. We have to join these 2 tables. its legacy application so we are not able to change tables and columns.
Till this time below condition was working properly
to_char(A.x)=B.y;
But since there is index on Y, performance team suggested us to do
A.x=to_number(B.y); it is running in dev env.
My question is, in any circumstances will this query give error? if it picks '123' definitely it will give 123. but if it picks 'AB123' then it will fail. can it fail? can it pick 'AB123' even when it is getting joined with other table.

can it fail?
Yes. It must put every row through TO_NUMBER before it can check whether or not it meets the filter condition. Therefore, if you have any one row where it will fail then it will always fail.
From Oracle 12.2 (since you tagged Oracle 12) you can use:
SELECT *
FROM A
INNER JOIN B
ON (A.x = TO_NUMBER(B.y DEFAULT NULL ON CONVERSION ERROR))
Alternatively, put an index on TO_CHAR(A.x) and use your original query:
SELECT *
FROM A
INNER JOIN B
ON (TO_CHAR(A.x) = B.y)
Also note: Having an index on B.y does not mean that the index will be used. If you are filtering on TO_NUMBER(B.y) (with or without the default on conversion error) then you would need a function-based index on the function TO_NUMBER(B.Y) that you are using. You should profile the queries and check the explain plans to see whether there is any improvement or change in use of indexes.

Never convert a VARCHAR2 column that can contain non-mumeric strings to_number.
This can partially work, but will eventuelly definitively fail.
Small Example
create table a as
select rownum X from dual connect by level <= 10;
create table b as
select to_char(rownum) Y from dual connect by level <= 10
union all
select 'Hello' from dual;
This could work (as you limit the rows, so that the conversion works; if you are lucky and Oracle chooses the right execution plan; which is probable, but not guarantied;)
select *
from a
join b on A.x=to_number(B.y)
where B.y = '1';
But this will fail
select *
from a
join b on A.x=to_number(B.y)
ORA-01722: invalid number
Performance
But since there is index on Y, performance team suggested us to do A.x=to_number(B.y);
You should chalange the team, as if you use a function on a column (to_number(B.y)) index can't be used.
On the contrary, your original query can perfectly use the following indexes:
create index b_y on b(y);
create index a_x on a(x);
Query
select *
from a
join b on to_char(A.x)=B.y
where A.x = 1;
Execution Plan
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 5 | 1 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 5 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN| A_X | 1 | 3 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN| B_Y | 1 | 2 | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."X"=1)
3 - access("B"."Y"=TO_CHAR("A"."X"))

Related

How to get rid of FULL TABLE SCAN in oracle

I have one query and it is giving me full table scan while doing explain plan , so will you tell me how to get rid of it.
output:
|* 9 | INDEX UNIQUE SCAN | GL_PERIODS_U1 | 1 | | | 1 (0)|
|* 10 | TABLE ACCESS FULL | GL_PERIODS | 12 | 372 | | 6 (0)|
|* 11 | TABLE ACCESS BY INDEX ROWID | GL_JE_HEADERS | 1 | 37 | | 670 (0)|
|* 12 | INDEX RANGE SCAN | GL_JE_HEADERS_N2 | 3096 | | | 11 (0)|
|* 13 | TABLE ACCESS BY INDEX ROWID | GL_JE_BATCHES | 1 | 8 | | 2 (0)|
|* 14 | INDEX UNIQUE SCAN | GL_JE_BATCHES_U1 | 1 | | | 1 (0)|
|* 15 | INDEX RANGE SCAN | GL_JE_LINES_U1 | 746 | | | 4 (0)|
| 16 | TABLE ACCESS FULL | GL_CODE_COMBINATIONS | 1851K| 30M| | 13023 (1)|
My query :
explain plan for
select cc.segment1,
cc.segment2,
h.currency_code,
SUM(NVL(l.accounted_dr,0) - NVL(l.accounted_cr,0))
from gl_code_combinations cc
,gl_je_lines l
,gl_je_headers h
,gl_je_batches b
,gl_periods p1
,gl_periods p2
where cc.code_combination_id = l.code_combination_id
AND b.je_batch_id = h.je_batch_id
AND b.status = 'P'
AND l.je_header_id = h.je_header_id
AND h.je_category = 'Revaluation'
AND h.period_name = p1.period_name
AND p1.period_set_name = 'Equant Master'
AND p2.period_name = 'SEP-16'
AND p2.period_set_name = 'Equant Master'
AND p1.start_date <= p2.end_date
AND h.set_of_books_id = '1429'
GROUP BY cc.segment1,
cc.segment2,
h.currency_code
please suggest
I see you are using the Oracle e-Business Suite data model. In that model, GL_PERIODS, being the table of accounting periods (usually weeks or months), is usually fairly small. Further, you are telling it you want every period prior to September 2016, which is likely to be almost all the periods in your "Equant Master" period set. Depending on how many other period sets you have defined, your full table scan may very well be the optimal (fastest running) plan.
As others have correctly pointed out, full table scans aren't necessarily worse or slower than other access paths.
To determine if your FTS really is a problem, you can use DBMS_XPLAN to get timings of how long each step in your plan is taking. Like this:
First, tell Oracle to keep track of plan-step-level statistics for your session
alter session set statistics_level = ALL;
Make sure you turn of DBMS_OUTPUT / server output
Run your query to completion (i.e., scroll to the bottom of the result set)
Finally, run this query:
SELECT *
FROM TABLE (DBMS_XPLAN.display_cursor (null, null,
'ALLSTATS LAST'));
The output will tell you exactly why your query is taking so long (if it is taking long). It is much more accurate than just picking out all the full table scans in your explain plan.
First thing, why do you want to avoid full table scan? All full table scans are not bad.
You are joining on the same table cc.code_combination_id = l.code_combination_id. I don't think there is a away to avoid full table scan on these type of joins.
To understand this, I created test tables and data.
create table I1(n number primary key, v varchar2(10));
create table I2(n number primary key, v varchar2(10));
and a map table
create table MAP(n number primary key, i1 number referencing I1(n),
i2 number referencing I2(n));
I created index on map table.
create index map_index_i1 on map(i1);
create index map_index_i2 on map(i2);
Here is the sample data that I inserted.
SQL> select * from i1;
N V
1 ONE
2 TWO
5 FIVE
SQL> select * from i2;
N V
3 THREE
4 FOUR
5 FIVE
SQL> select * from map;
N I1 I2
1 1 3
2 1 4
5 5 5
I do gathered the statistics. Then, I executed the query which uses I1 and I2 from map table.
explain plan for
select map.n,i1.v
from i1,map
where map.i2 = map.i1
and i1.n=5
Remember, we have index on I1 and I2 of map table. I thought the optimizer might use the index, but unfortunately it didn't.
Full table scan
Because the condition map.i2 = map.i1 means compare every record of map table's I2 column with I1.
Next, I used one of the indexed columns in the where condition and now it picked the index.
explain plan for
select map.n,i1.v
from i1,map
where map.i2 = map.i1
and i1.n=5
and map.i1=5
Index scan
Have a look at ASK Tom's pages for full table scans. Unfortunately, I couldn't paste the source an link since I have less than 10 reputation !!

ORDER BY subquery and ROWNUM goes against relational philosophy?

Oracle 's ROWNUM is applied before ORDER BY. In order to put ROWNUM according to a sorted column, the following subquery is proposed in all documentations and texts.
select *
from (
select *
from table
order by price
)
where rownum <= 7
That bugs me. As I understand, table input into FROM is relational, hence no order is stored, meaning the order in the subquery is not respected when seen by FROM.
I cannot remember the exact scenarios but this fact of "ORDER BY has no effect in the outer query" I have read more than once. Examples are in-line subqueries, subquery for INSERT, ORDER BY of PARTITION clause, etc. For example in
OVER (PARTITION BY name ORDER BY salary)
the salary order will not be respected in outer query, and if we want salary to be sorted at outer query output, another ORDER BY need to be added in the outer query.
Some insights from everyone on why the relational property is not respected here and order is stored in the subquery ?
The ORDER BY in this context is in effect Oracle's proprietary syntax for generating an "ordered" row number on a (logically) unordered set of rows. This is a poorly designed feature in my opinion but the equivalent ISO standard SQL ROW_NUMBER() function (also valid in Oracle) may make it clearer what is happening:
select *
from (
select ROW_NUMBER() OVER (ORDER BY price) rn, *
from table
) t
where rn <= 7;
In this example the ORDER BY goes where it more logically belongs: as part of the specification of a derived row number attribute. This is more powerful than Oracle's version because you can specify several different orderings defining different row numbers in the same result. The actual ordering of rows returned by this query is undefined. I believe that's also true in your Oracle-specific version of the query because no guarantee of ordering is made when you use ORDER BY in that way.
It's worth remembering that Oracle is not a Relational DBMS. In common with other SQL DBMSs Oracle departs from the relational model in some fundamental ways. Features like implicit ordering and DISTINCT exist in the product precisely because of the non-relational nature of the SQL model of data and the consequent need to work around keyless tables with duplicate rows.
Not surprisingly really, Oracle treats this as a bit of a special case. You can see that from the execution plan. With the naive (incorrect/indeterminate) version of the limit that crops up sometimes, you get SORT ORDER BY and COUNT STOPKEY operations:
select *
from my_table
where rownum <= 7
order by price;
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 13 | 3 (34)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | TABLE ACCESS FULL| MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM<=7)
If you just use an ordered subquery, with no limit, you only get the SORT ORDER BY operation:
select *
from (
select *
from my_table
order by price
);
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 13 | 3 (34)| 00:00:01 |
| 2 | TABLE ACCESS FULL| MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------
With the usual subquery/ROWNUM construct you get something different,
select *
from (
select *
from my_table
order by price
)
where rownum <= 7;
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (34)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 13 | 3 (34)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 13 | 3 (34)| 00:00:01 |
| 4 | TABLE ACCESS FULL | MY_TABLE | 1 | 13 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=7)
3 - filter(ROWNUM<=7)
The COUNT STOPKEY operation is still there for the outer query, but the inner query (inline view, or derived table) now has a SORT ORDER BY STOPKEY instead of the simple SORT ORDER BY. This is all hidden away in the internals so I'm speculating, but it looks like the stop key - i.e. the row number limit - is being pushed into the subquery processing, so in effect the subquery may only end up with seven rows anyway - though the plan's ROWS value doesn't reflect that (but then you get the same plan with a different limit), and it still feels the need to apply the COUNT STOPKEY operation separately.
Tom Kyte covered similar ground in an Oracle Magazine article, when talking about "Top- N Query Processing with ROWNUM" (emphasis added):
There are two ways to approach this:
- Have the client application run that query and fetch just the first N rows.
- Use that query as an inline view, and use ROWNUM to limit the results, as in SELECT * FROM ( your_query_here ) WHERE ROWNUM <= N.
The second approach is by far superior to the first, for two reasons. The lesser of the two reasons is that it requires less work by the client, because the database takes care of limiting the result set. The more important reason is the special processing the database can do to give you just the top N rows. Using the top- N query means that you have given the database extra information. You have told it, "I'm interested only in getting N rows; I'll never consider the rest." Now, that doesn't sound too earth-shattering until you think about sorting—how sorts work and what the server would need to do.
... and then goes on to outline what it's actually doing, rather more authoritatively than I can.
Interestingly I don't think the order of the final result set is actually guaranteed; it always seems to work, but arguably you should still have an ORDER BY on the outer query too to make it complete. It looks like the order isn't really stored in the subquery, it just happens to be produced like that. (I very much doubt that will ever change as it would break too many things; this ends up looking similar to a table collection expression which also always seems to retain its ordering - breaking that would stop dbms_xplan working though. I'm sure there are other examples.)
Just for comparison, this is what the ROW_NUMBER() equivalent does:
select *
from (
select ROW_NUMBER() OVER (ORDER BY price) rn, my_table.*
from my_table
) t
where rn <= 7;
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 52 | 4 (25)| 00:00:01 |
|* 1 | VIEW | | 2 | 52 | 4 (25)| 00:00:01 |
|* 2 | WINDOW SORT PUSHED RANK| | 2 | 26 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL | MY_TABLE | 2 | 26 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN"<=7)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "PRICE")<=7)
Adding to sqlvogel's good answer :
"As I understand, table input into FROM is relational"
No, table input into FROM is not relational. It is not relational because "table input" are tables and tables are not relations. The myriads of quirks and oddities in SQL eventually all boil down to that simple fact : the core building brick in SQL is the table, and a table is not a relation. To sum up the differences :
Tables can contain duplicate rows, relations cannot. (As a consequence, SQL offers bag algebra, not relational algebra. As another consequence, it is as good as impossible for SQL to even define equality comparison for its most basic building brick !!! How would you compare tables for equality given that you might have to deal with duplicate rows ?)
Tables can contain unnamed columns, relations cannot. SELECT X+Y FROM ... As a consequence, SQL is forced into "column identity by ordinal position", and as a consequence of that, you get all sorts of quirks, e.g. in SELECT A,B FROM ... UNION SELECT B,A FROM ...
Tables can contain duplicate column names, relations cannot. A.ID and B.ID in a table are not distinct column names. The part before the dot is not part of the name, it is a "scope identifier", and that scope identifier "disappears" once you're "outside the SELECT" it appears/is introduced in. You can verify this with a nested SELECT : SELECT A.ID FROM (SELECT A.ID, B.ID FROM ...). It won't work (unless your particular implementation departs from the standard in order to make it work).
Various SQL constructs leave people with the impression that tables do have an ordering to rows. The ORDER BY clause, obviously, but also the GROUP BY clause (which can be made to work only by introducing rather dodgy concepts of "intermediate tables with rows grouped together"). Relations simply are not like that.
Tables can contain NULLs, relations cannot. This one has been beaten to death.
There should be some more, but I don't remember them off the tip of the hat.

why does Oracle take another execution path on a view

We are using a view on a oracle 10g database to provide data to a .NET application. The nice part of this is that we need a number(12) in the view so the .NET apllication sees this as an integer. So in the select there is a cast(field as NUMBER(12)). So far so good the cost is if we use a where clause on some fields 0.9k. But now the funny part if we make an view of this and query the view with an where clause the cost goes from 0.9k to 18k.
In the explain plan suddenly all indexes are skipped and this results in lots of full table scans. Why does this happen when we use a view?
The simplified version of the problem:
SELECT CAST (a.numbers AS NUMBER (12)) numbers
FROM tablea a
WHERE a.numbers = 201813754;
explain plan:
Plan
SELECT STATEMENT ALL_ROWSCost: 1 Bytes: 7 Cardinality: 1
1 INDEX UNIQUE SCAN INDEX (UNIQUE) TAB1_IDX Cost: 1 Bytes: 7 Cardinality: 1
No problem index hit
If we put the above query in a view and execute the same query:
SELECT a.numbers
FROM index_test a
WHERE a.numbers = 201813754;
No index is used.
Explain plan:
Plan
SELECT STATEMENT ALL_ROWSCost: 210 Bytes: 2,429 Cardinality: 347
1 TABLE ACCESS FULL TABLE TABLEA Object Instance: 2 Cost: 210 Bytes: 2,429 Cardinality: 347
The issue is you're applying a function to the column (cast in this case). Oracle can't use the index you have as your query stands. To fix this you either need to remove the cast function from your view, or create a function based index:
create table tablea (numbers integer);
insert into tablea
select rownum from dual connect by level <= 1000;
create index ix on tablea (numbers);
-- query on base table uses index
explain plan for
SELECT * FROM tablea
where numbers = 1;
SELECT * FROM table(dbms_xplan.display(null,null, 'BASIC +PREDICATE'));
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX RANGE SCAN| IX |
---------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("NUMBERS"=1)
create view v as
SELECT cast(numbers as number(12)) numbers FROM tablea;
-- the cast function in the view means we can't use the index
-- note the filter in below the plan
explain plan for
SELECT * FROM v
where numbers = 1;
SELECT * FROM table(dbms_xplan.display(null,null, 'BASIC +PREDICATE'));
------------------------------------
| Id | Operation | Name |
------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | TABLE ACCESS FULL| TABLEA |
------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(CAST("NUMBERS" AS number(12))=1)
-- create the function based index and we're back to an index range scan
create index iv on tablea (cast(numbers as number(12)));
explain plan for
SELECT * FROM v
where numbers = 1;
SELECT * FROM table(dbms_xplan.display(null,null, 'BASIC +PREDICATE'));
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX RANGE SCAN| IV |
---------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access(CAST("NUMBERS" AS number(12))=1)

Generalising Oracle Static Cursors

I am creating an OLAP-like package in Oracle where you call a main, controlling function that assembles its returning output table by making numerous left joins. These joined tables are defined in 'slave' functions within the package, which return specific subsets using static cursors, parameterised by the function's arguments. The thing is, these cursors are all very similar.
Is there a way, beyond generating dynamic queries and using them in a ref cursor, that I can generalise these. Every time I add a function, I get this weird feeling, as a developer, that this isn't particularly elegant!
Pseduocode
somePackage
function go(param)
return select myRows.id,
stats1.value,
stats2.value
from myRows
left join table(somePackage.stats1(param)) stats1
on stats1.id = myRows.id
left join table(somePackage.stats2(param)) stats2
on stats2.id = myRows.id
function stats1(param)
return [RESULTS OF SOME QUERY]
function stats2(param)
return [RESULTS OF A RELATED QUERY]
The stats queries all have the same structure:
First they aggregate the data in a useful way
Then they split this data into logical sections, based on criteria, and aggregate again (e.g., by department, by region, etc.) then union the results
Then they return the results, cast into the relevant object type, so I can easily do a bulk collect
Something like:
cursor myCursor is
with fullData as (
[AGGREGATE DATA]
),
fullStats as (
[AGGREGATE FULLDATA BY TOWN]
union all
[AGGREGATE FULLDATA BY REGION]
union all
[AGGREGATE FULLDATA BY COUNTRY]
)
select myObjectType(fullStats.*)
from fullStats;
...
open myCursor;
fetch myCursor bulk collect into output limit 1000;
close myCursor;
return output;
Filter operations can help build dynamic queries with static SQL. Especially when the column list is static.
You may have already considered this approach but discarded it for performance reasons. "Why execute every SQL block if we only need the results
from one of them?" You're in luck, the optimizer already does this for you with a FILTER operation.
Example Query
First create a function that waits 5 seconds every time it is run. It will help find which query blocks were executed.
create or replace function slow_function return number is begin
dbms_lock.sleep(5);
return 1;
end;
/
This static query is controlled by bind variables. There are three query blocks but the entire query runs in 5 seconds instead of 15.
declare
v_sum number;
v_query1 number := 1;
v_query2 number := 0;
v_query3 number := 0;
begin
select sum(total)
into v_sum
from
(
select total from (select slow_function() total from dual) where v_query1 = 1
union all
select total from (select slow_function() total from dual) where v_query2 = 1
union all
select total from (select slow_function() total from dual) where v_query3 = 1
);
end;
/
Execution Plan
This performance is not the result of good luck; it's not simply Oracle randomly executing one predicate before another. Oracle analyzes the bind variables before
run-time and does not even execute the irrelevant query blocks. That's what the FILTER operation below is doing. (Which is a poor name, many people generally
refer to all predicates as "filters". But only some of them result in a FILTER operation.)
select * from table(dbms_xplan.display_cursor(sql_id => '0cfqc6a70kzmt'));
SQL_ID 0cfqc6a70kzmt, child number 0
-------------------------------------
SELECT SUM(TOTAL) FROM ( SELECT TOTAL FROM (SELECT SLOW_FUNCTION()
TOTAL FROM DUAL) WHERE :B1 = 1 UNION ALL SELECT TOTAL FROM (SELECT
SLOW_FUNCTION() TOTAL FROM DUAL) WHERE :B2 = 1 UNION ALL SELECT TOTAL
FROM (SELECT SLOW_FUNCTION() TOTAL FROM DUAL) WHERE :B3 = 1 )
Plan hash value: 926033116
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 6 (100)| |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | VIEW | | 3 | 39 | 6 (0)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
|* 4 | FILTER | | | | | |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 6 | FILTER | | | | | |
| 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
|* 8 | FILTER | | | | | |
| 9 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter(:B1=1)
6 - filter(:B2=1)
8 - filter(:B3=1)
Issues
The FILTER operation is poorly documented. I can't explain in detail when it does or does not work, and exactly what affects it has on other parts of the query. For example, in the explain plan the Rows estimate is 3, but at run time Oracle should easily be able to estimate the cardinality is 1. Apparently the execution plan is not that dynamic, that poor cardinality estimate may cause later issues. Also, I've seen some weird cases where static expressions are not appropriately filtered. But if a query uses a simple equality predicate it should be fine.
This approach allows you remove all dynamic SQL and replace it with a large static SQL statement. Which has some advantages; dynamic SQL is often "ugly" and difficult to debug. But people only familiar with procedural programming tend to think of a single SQL statement as one huge God-function, a bad practice. They won't appreciate that the UNION ALLs create independent blocks of SQL
Dynamic SQL is Still Probably Better
In general I would recommend against this approach. What you have is good because it looks good. The main problem with dynamic SQL is that people don't treat it like real code; it's not commented or formatted and ends up looking like a horrible mess that nobody can understand. If you are able to spend the extra time to generate clean code then you should stick with that.

is there a tricky way to optimize this query

I'm working on a table that has 3008698 rows
exam_date is a DATE field.
But queries I run want to match only the month part. So what I do is:
select * from my_big_table where to_number(to_char(exam_date, 'MM')) = 5;
which I believe takes long because of function on the column. Is there a way to avoid this and make it faster? other than making changes to the table? exam_date in the table have different date values. like 01-OCT-10 or 12-OCT-10...and so on
I don't know Oracle, but what about doing
WHERE exam_date BETWEEN first_of_month AND last_of_month
where the two dates are constant expressions.
select * from my_big_table where MONTH(exam_date) = 5
oops.. Oracle huh?..
select * from my_big_table where EXTRACT(MONTH from exam_date) = 5
Bear in mind that since you want approximately 1/12th of all the data, it may well be more efficient for Oracle to perform a full table scan anyway. This may explain why performance was worse when you followed harpo's advice.
Why? Suppose your data is such that 20 rows fit on each database block (on average), so that you have a total of 3,000,000/20 = 150,000 blocks. That means a full table scan will require 150,000 block reads. Now about 1/12th of the 3,000,000 rows will be for month 05. 3,000,000/12 is 250,000. So that's 250,000 table reads if you use the index - and that's ignoring the index reads that will also be required. So in this example the full table scan does a lot less work than the indexed search.
Bear in miond that there are only twelve distinct values for MONTH. So unless you have a strongly clustered set of records (say if you use partitioining) it is possible that using an index is not necessarily the most efficient way of querying in this fashion.
I didn't find that using EXTRACT() lead the optimizer to use a regular index on my date column but YMMV:
SQL> create index big_d_idx on big_table(col3) compute statistics
2 /
Index created.
SQL> set autotrace traceonly explain
SQL> select * from big_table
2 where extract(MONTH from col3) = 'MAY'
3 /
Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23403 | 1028K| 4351 (3)| 00:00:53 |
|* 1 | TABLE ACCESS FULL| BIG_TABLE | 23403 | 1028K| 4351 (3)| 00:00:53 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(EXTRACT(MONTH FROM INTERNAL_FUNCTION("COL3"))=TO_NUMBER('M
AY'))
SQL>
What definitely can persuade the optimizer to use an index in these scenarios is building a function-based index:
SQL> create index big_mon_fbidx on big_table(extract(month from col3))
2 /
Index created.
SQL> select * from big_table
2 where extract(MONTH from col3) = 'MAY'
3 /
Execution Plan
----------------------------------------------------------
Plan hash value: 225326446
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23403 | 1028K| 475 (0)|00:00:06|
| 1 | TABLE ACCESS BY INDEX ROWID| BIG_TABLE | 23403 | 1028K| 475 (0)|00:00:06|
|* 2 | INDEX RANGE SCAN | BIG_MON_FBIDX | 9361 | | 382 (0)|00:00:05|
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(EXTRACT(MONTH FROM INTERNAL_FUNCTION("COL3"))=TO_NUMBER('MAY'))
SQL>
The function call means that Oracle won't be able to use any index that might be defined on the column.
Either remove the function call (as in harpo's answer) or use a function based index.

Resources