I have a large Oracle table with an indexed date_time field: "DISCONNECT_DATE"
When I use the following where clause my query runs quickly:
DISCONNECT_DATE > TO_DATE('01-DEC-2016', 'DD-MON-YYYY') AND
DISCONNECT_DATE < TO_DATE('01-JAN-2017', 'DD-MON-YYYY')
When I use the following where clause my query runs (very) slowly:
extract(month from disconnect_date) = '12' and
extract(year from disconnect_date) = '2016'
They are both more or less equivalent in intent. Why does the former use the index and the latter not? (I don't think I have this problem in SQL Server.)
(I am using PL/SQL Developer to write the query.)
The issue is the use of indexes. In the first version, all the functions are on the "constant" side of the comparison, not on the "column" side, so Oracle can readily see that the index on DISCONNECT_DATE applies.
Once the column is wrapped in extract(), though, a plain index on the column can no longer be used. If you want to use that construct, you can create a function-based index on the extract() calls:
create index idx_t_ddyear_ddmonth on t(extract(month from disconnect_date), extract(year from disconnect_date));
Note: extract() returns a number not a string, so you should get rid of the single quotes. Mixing data types can also confuse the optimizer.
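With that index in place and the quotes removed, a query like the following should be able to use it (a minimal sketch against the same table t):
select *
from t
where extract(month from disconnect_date) = 12
  and extract(year from disconnect_date) = 2016;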
I am a SQL Server guy who has just started working on Netezza. One thing that comes up daily is a query to find the size of a table filtered by year: 2016, 2015, 2014, ...
What I am using now is something like the query below, and it works for me, but I wonder if there is a better way to do it:
select count(1)
from table
where extract(year from datecolumn) = 2016
extract is a built-in function, but applying a function to a table with 10 billion+ rows is unimaginable in SQL Server, to my knowledge.
Thank you for your advice.
The only problem I see with the query is the WHERE clause, which applies a function to the column side of the comparison. That effectively disables zone maps and thus forces Netezza to scan all data pages, not only those with data from that year.
Instead write something like:
select count(1)
from table
where datecolumn between '2016-01-01' and '2016-12-31'
A more generic alternative is to create a 'date dimension table' with one row per day covering the range of dates in your tables (and a couple of years into the future).
This is an example for Postgres: https://medium.com/@duffn/creating-a-date-dimension-table-in-postgresql-af3f8e2941ac
This enables you to write code like this:
select count(1)
from table t join d_date d on t.datecolumn = d.date_actual
where d.year_actual = 2016
You may not have the generate_series() function on your system, but a 'select row_number()...' can do the same trick. A download is available here: https://www.ibm.com/developerworks/community/wikis/basic/anonymous/api/wiki/76c5f285-8577-4848-b1f3-167b8225e847/page/44d502dd-5a70-4db8-b8ee-6bbffcb32f00/attachment/6cb02340-a342-42e6-8953-aa01cbb10275/media/generate_series.tgz
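If you just need to seed the dimension table, something along these lines can stand in for generate_series() (a rough sketch; d_date and any_large_table are hypothetical names, and the exact syntax may need adjusting on your Netezza release):
-- generate one row per day for roughly ten years, starting 2014-01-01
insert into d_date (date_actual)
select cast('2014-01-01' as date) + row_number() over (order by rowid) - 1
from any_large_table
limit 3653;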
A couple of further notes on 'date interval' where clauses:
Those columns are the most likely candidates for zone map optimization. Add an 'organize on (datecolumn)' at the bottom of your table DDL and groom the table. That will cause Netezza to move records to pages with similar dates, and query times will improve.
Furthermore, you should ensure that the 'distribute on' clause for the table results in an even distribution across data slices if the table is big. The execution of the query will never be faster than the slowest data slice.
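Put together, the DDL might look like this (a hedged sketch; the table and column names are made up):
create table call_fact (
    call_id    bigint,
    datecolumn date,
    duration   integer
)
distribute on (call_id)    -- high-cardinality key for an even spread across data slices
organize on (datecolumn);  -- clusters rows by date so zone maps can skip pages
After loading data, run 'groom table call_fact;' so Netezza actually reorders the records.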
I hope this helps
I have a query, something like
select * from table1 where :iparam is null or :iparam = field1;
On field1 there is a non-unique index, but Oracle (11g) doesn't want to use it. As I understand it, the query is optimized at compile time, not at run time. I'm using such queries in stored procedures. I wonder if there is a way to tell Oracle to use the index?
I know about "hints", but I would like something that applies to the whole project, like an optimizer parameter, to optimize queries at run time.
where :iparam is null or :iparam = field1;
Oracle has no way of knowing in advance whether you will pass a NULL value for :iparam.
If you do, a full scan is the best way to access the data; if you don't, an index might be better. You can split this statement into two parts using IF; then there is no ambiguity.
If you have a lot of fields to compare, dynamic SQL might be a better way:
IF :param1 IS NOT NULL THEN
v_sql := v_sql||' and field1 = :param1';
ELSE
v_sql := v_sql||' and nvl(:param1,1) = 1';
END IF;
The ELSE part is there so the number of bind variables stays the same in both branches, which makes the USING clause easier to handle.
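Wrapped in a procedure, the whole pattern might look like this (a minimal sketch; table1 and field1 come from the question, the other names are hypothetical):
create or replace procedure get_rows(p_param1 in number, p_result out sys_refcursor)
as
    v_sql varchar2(4000) := 'select * from table1 where 1 = 1';
begin
    if p_param1 is not null then
        v_sql := v_sql || ' and field1 = :b1';     -- the index on field1 can now be used
    else
        v_sql := v_sql || ' and nvl(:b1, 1) = 1';  -- placeholder keeps the bind count constant
    end if;
    open p_result for v_sql using p_param1;        -- the same USING clause works in both branches
end;
/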
It is not true that the execution plan is determined at package compilation time. It is determined just before the query is actually executed.
How the optimizer decides to run the query depends on many things, foremost the availability of statistics. These give the optimizer something to go on: how many records are in the table, how many distinct values are in the index.
Here is an article that goes into more detail:
http://joco.name/2014/01/05/why-wouldnt-oracle-use-a-perfectly-valid-index/
In practice, are the following SQL queries equally efficient?
select * from aTable where myColumn LIKE '%'
select * from aTable
I know that in principle the first clause implies a table/index scan whereas the second doesn't, but I am wondering whether SQL Server and Oracle these days (SS 2008 onwards, and Oracle 10g onwards) are intelligent enough to discard or ignore the redundant WHERE clause.
The reason I ask is that I am looking at an application that dynamically generates SQL queries and inserts clauses such as myColumn LIKE '%', and I wonder if I should spend scarce time worrying about that.
Well, this problem is general in SQL Server CE.
I have indexes on all the fields.
Also, the same query but with ID IN (list of int ids) is pretty fast.
I tried to change the query to an OUTER JOIN, but that just made it worse.
So, any hints on why this happens and how to fix this problem?
That's because the index is not really helpful for that kind of query, so the database has to do a full table scan. If the query is (for some reason) slower than a simple SELECT * FROM TABLE, do that instead and filter out the unwanted IDs in the program.
EDIT: from your comment, I see that you use a subquery instead of a list. Given that, there are three possible ways to do the same thing (hopefully one of them is faster):
Original statement:
select * from mytable where id not in (select id from othertable);
Alternative 1:
select * from mytable where not exists
(select 1 from othertable where mytable.id=othertable.id);
Alternative 2:
select * from mytable
minus
select mytable.* from mytable inner join othertable on mytable.id=othertable.id;
Alternative 3: (ugly and hard to understand, but if everything else fails...)
select * from mytable
left outer join othertable on (mytable.id=othertable.id)
where othertable.id is null;
This is not specific to SQL Server CE; it applies to databases in general.
The IN operator is sargable and NOT IN is non-sargable.
What does this mean?
Sargable stands for Search ARGument ABLE: the DBMS engine can take advantage of an index for a sargable predicate, while for a non-sargable one the index can't be used.
The solution might be to rewrite the filter so that it does not rely on NOT IN.
More in SQL Performance Tuning by Peter Gulutzan.
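A quick illustration of the difference (mytable as above; the literal IDs are hypothetical):
select * from mytable where id in (1, 2, 3);       -- sargable: an index seek is possible
select * from mytable where id not in (1, 2, 3);   -- non-sargable: typically a full scan
select * from mytable where abs(id) = 3;           -- also non-sargable: function on the column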
ammoQ is right: the index does not help much with your query. Depending on the distribution of values in your ID column, you could optimize the query by specifying which IDs to select rather than which to exclude. If you end up requesting more than roughly 25% of the table, the index will not be used anyway, because for nonclustered indexes (which, if memory serves, are the only type SQL CE supports) it is cheaper to scan the table. Otherwise (if the query is actually selective) you could rewrite the query with ID ranges to select; 'union all' may work better than 'or' to combine the ranges, if SQL CE supports 'union all' (not sure).
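For example, the range rewrite might look like this (the ranges are hypothetical; use ones that cover the IDs you actually want):
select * from mytable where id between 1 and 100
union all
select * from mytable where id between 201 and 300;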
To optimize SELECT queries, I run them both with and without an index and measure the difference. I run a bunch of different similar queries and try to select different data to make sure that caching doesn't throw off the results. However, on very large tables, indexes take a really long time to create, and I have several different ideas about what indexes would be appropriate.
Is it possible in Oracle (or any other database for that matter) to perform a query but tell the database to not use a certain index when performing the query? Or just turn off the index entirely, but be able to easily switch it back on without having to re-index the entire table? This would make it much easier to test, since I can create all the indexes I'm thinking about all at once, then try my queries using different ones.
Alternatively, is there any better way to go about optimizing queries on large tables and know which indexes would be best to create?
You can set index visibility in 11g -
ALTER INDEX idx1 [ INVISIBLE | VISIBLE ]
This makes it unusable by the optimizer, but Oracle still maintains the index when data is added or removed. That makes it easy to test performance with the index disabled, without having to drop and rebuild the whole index.
See the Oracle documentation on index visibility.
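A typical test session might look like this (idx1 as above; the last statement re-enables invisible indexes for the current session only):
alter index idx1 invisible;
-- run the query under test; the optimizer now ignores idx1
alter index idx1 visible;
-- alternatively, while idx1 is invisible:
alter session set optimizer_use_invisible_indexes = true;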
You can use the NO_INDEX hint in the queries to ignore the indexes - see docs for further details. The SQL Access Advisor is an Oracle utility that will recommend indexing strategies.
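For example (the table, alias, and index name are placeholders; the hint names the table alias and the index to ignore):
select /*+ NO_INDEX(t idx1) */ *
from mytable t
where id = 42;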
Well, you can write the query in such a way that it won't use the index (by applying an expression to the column instead of comparing the raw value).
For example
Select * from foobar where column1 = 'result' --uses index on column1
To avoid using the index on a number or a varchar column:
Select * from foobar where column1 + 0 = 5 -- simple expression to disable the index
Select * from foobar where column1 || '' = 'result' --simple expression to disable the index
Or you can just use NVL to disable the index in the query without worrying about the column's data type
Select * from foobar where nvl(column1,column1) = 'result' --i love this way :D
Similarly, you can use index hints like /*+ INDEX(E employee_id) */ to force the use of an index (note the + after /*; without it, Oracle treats the hint as a plain comment).
P.S. This is all the paraphrased from Dan Tow's Book SQL Tuning. I started reading it a few days back :)