Best way to index a table in Oracle - oracle

Is it good practice to have multiple indexes on the same column in a table?
E.g. there is a table A with columns col1, col2, col3, col4, col5, col6.
Indexes:
Index1 on (col1,col2,col3)
Index2 on (col4,col2,col6)
Index3 on (col2,col1,col3,col4,col5)
In this case col1, col2 and col3 are each part of more than one index on table A. Is it not enough for each column to be part of a single index?
What is the use of having the same column in multiple indexes? Please clarify.

It depends;)
In a lot of cases it's very useful to have several indexes containing the same column.
See also this article at Ask Tom:
Column order in Index
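To make this concrete with the indexes from the question, here is a rough sketch of queries that each index is suited to. The column types and the literal values are assumptions for illustration, and the plan Oracle actually picks always depends on statistics and data.

```sql
-- Table and indexes as described in the question (column types are assumed).
CREATE TABLE a (col1 NUMBER, col2 NUMBER, col3 NUMBER,
                col4 NUMBER, col5 NUMBER, col6 NUMBER);
CREATE INDEX index1 ON a (col1, col2, col3);
CREATE INDEX index2 ON a (col4, col2, col6);
CREATE INDEX index3 ON a (col2, col1, col3, col4, col5);

-- The leading column col1 is constrained: Index1 is a natural candidate.
SELECT col3 FROM a WHERE col1 = 10 AND col2 = 20;

-- The leading column col4 is constrained: Index2 is a natural candidate.
SELECT col6 FROM a WHERE col4 = 40 AND col2 = 20;

-- Only col2 is constrained. Index1 and Index2 do not lead with col2,
-- but Index3 does, so it is the natural candidate here.
SELECT col1, col3, col4, col5 FROM a WHERE col2 = 20;
```

Each query filters on a different leading column, which is why col2 ends up in all three indexes without any of them being redundant.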
Whether an index helps depends a lot on your data and your query. The percentage of records your query returns is especially important.
The following example is somewhat simplified, but in general this is how the system works:
Imagine your database stores 10 records in each database block and your query has to return 10% of all records, so on average every 10th record has to be returned. In the worst case the matching records are spread evenly and you have to read every database block anyway, which is what a full table scan does. In that case an index slows the query down, because you have to read the index in addition to the blocks.
There are different opinions on when to use an index. My rule of thumb is this:
If 1% or less of the rows are selected, an index is good
If 10% or more are selected, a full table scan is good
Otherwise (between 1% and 10%), a more detailed analysis is required
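To see where a given predicate falls against this rule of thumb, you can measure the percentage directly. This is only a sketch: the table orders and the predicate status = 'OPEN' are hypothetical placeholders for your own table and filter.

```sql
-- Hypothetical table and predicate; substitute your own.
SELECT COUNT(*) AS total_rows,
       COUNT(CASE WHEN status = 'OPEN' THEN 1 END) AS matching_rows,
       -- NULLIF avoids a divide-by-zero error on an empty table.
       ROUND(100 * COUNT(CASE WHEN status = 'OPEN' THEN 1 END)
                 / NULLIF(COUNT(*), 0), 2) AS pct_selected
FROM   orders;
```

Note that this query itself scans the table, so it is something to run once while analysing, not as part of the application.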

Related

Why does the same select statement have different costs in Oracle?

Recently I used an Oracle 11g database to do my homework. I had 12 tables, such as trip_data_11 and trip_data_12.
They have the same structure and almost the same number of records. I created the same indexes on each table.
So for the trip_data_11 table:
create index pick_add_11 on trip_data_11(pickup_longitude,pickup_latitude);
create index drop_add_11 on trip_data_11(dropoff_longitude,dropoff_latitude);
I did the same for trip_data_12.
Then I used the following select statement to count the number of taxis per day.
SELECT
COUNT(DISTINCT(td.medallion)) AS taxi_num
FROM
SYS.TRIP_DATA_11 td
WHERE
(td.pickup_longitude >= -74.2593 AND td.pickup_longitude <= -73.7011
AND td.pickup_latitude >= 40.4770 AND td.pickup_latitude <= 40.9171
)
AND
(td.dropoff_longitude >= -74.2593 AND td.dropoff_longitude <= -73.7011
AND td.dropoff_latitude >= 40.4770 AND td.dropoff_latitude <= 40.9171
)
AND
td.trip_distance > 0
AND
td.passenger_count > 0
GROUP BY
regexp_substr(td.pickup_datetime,'\d{4}-\d{2}-\d{2}')
ORDER BY
regexp_substr(td.pickup_datetime,'\d{4}-\d{2}-\d{2}');
It took 38 seconds. When I changed the table name to SYS.TRIP_DATA_12, the problem appeared: it ran for more than 2 hours.
What's more, it did not finish. I don't know why.
Today I asked my classmate and he said: clear the cache. So I used the following statements to do it.
alter system flush shared_pool;
alter system flush buffer_cache;
alter system flush global context;
Now when I use the same select statement for SYS.TRIP_DATA_11 I get the same poor performance as with SYS.TRIP_DATA_12. Why?
It seems like your classmate was having a good joke at your expense.
Clearly your query was only performing well because you had a warm buffer cache full of all the data you needed from TRIP_DATA_11. By flushing the caches you have zapped all that, and now you have the same bad performance for all tables.
Tuning queries is hard, because there are lots of possibilities. Please read the documentation on it.
To pick just one thing: you're searching ranges, which is problematic. How many rows fill -74.2593 to -73.7011 ? It might be a lot more than say -71.00 to -68.59 even though that's a broader range. Understanding your data - its volume, its distribution and its skew - is crucial.
As a first step, learn how to use EXPLAIN PLAN. To get better plans, gather statistics on your tables and their indexes using the DBMS_STATS package.
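A minimal sketch of both steps follows. The schema name MYSCHEMA is a placeholder, and the query is shortened to the pickup predicates to keep the example small.

```sql
-- Gather optimizer statistics for the table; cascade => TRUE also
-- gathers statistics for its indexes. MYSCHEMA is a placeholder.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'MYSCHEMA',
    tabname => 'TRIP_DATA_11',
    cascade => TRUE);
END;
/

-- Ask the optimizer for its plan without actually running the query.
EXPLAIN PLAN FOR
SELECT COUNT(DISTINCT td.medallion)
FROM   trip_data_11 td
WHERE  td.pickup_longitude BETWEEN -74.2593 AND -73.7011
AND    td.pickup_latitude  BETWEEN 40.4770 AND 40.9171;

-- Display the plan that EXPLAIN PLAN just stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the plans for TRIP_DATA_11 and TRIP_DATA_12 side by side is usually the quickest way to see whether the two tables are really being accessed the same way.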
One tip. Oracle will typically use only one index to access a table. So it will choose pick_add_11 or drop_add_11 but not both. It will then read all the matching records from the table and filter them by the other criteria. You may get much better performance from an index designed to service this query:
create index add_11 on trip_data_11
(pickup_longitude
, pickup_latitude
, dropoff_longitude
, dropoff_latitude
, trip_distance
, passenger_count )
;
The select statement can evaluate the entire filter against this index and only touch the table to get the MEDALLION and PICKUP_DATETIME values. (You could add those columns to the index too.) Experiment with the column order. As latitude has a narrower range than longitude, it should probably go first; maybe the drop-off values should appear before the pick-up ones. You want an index in which the greatest number of related records are clustered together.
Indexes like this can be an overhead, so we wouldn't want to maintain too many of them in real life. But they are a valuable technique for tuning expensive queries which are run frequently.
Oh, and @Justin's right: don't use SYS for application work. Even for a school assignment you should create a fresh schema and create your tables, etc. in that.
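Creating a fresh schema could look roughly like this. The user name, password and the USERS tablespace are assumptions; adjust them to your installation, and run the statements as a suitably privileged user.

```sql
-- Hypothetical schema for the assignment; name, password and tablespace are placeholders.
CREATE USER taxi_app IDENTIFIED BY "ChangeMe_123"
  DEFAULT TABLESPACE users
  QUOTA UNLIMITED ON users;

-- Minimal privileges to log in and create tables (and indexes on them).
GRANT CREATE SESSION, CREATE TABLE TO taxi_app;
```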

IOT vs Heap in Oracle. Help me make choice

I've read a lot about IOTs, and now my head is a mess...
Please help me answer this question.
I have a table with this structure:
ID (PK); ID_DRUG_NAME (a); ID_FROM (b); ID_PROVIDER (c); DELETED;
Rows are never deleted from this table, only marked as removed.
Many queries use ID; other queries use a,b or a,c or a,b,c.
I want to recreate this table using the ORGANIZATION INDEX clause.
How much would I gain from that?
How do I correctly create the primary key and indexes?
What pitfalls should I expect?
Index-organized tables (IOT) are best used when there is a single access-path. You've identified two different lead columns, so an IOT is probably not a good choice.
The issue here is that, if you make it an IOT, you have to choose one of the two columns (ID or ID_DRUG_NAME) that you'll frequently be filtering on to index. Theoretically, you could still add a second index on an IOT, but it's almost always a bad idea. An IOT with a second index typically performs worse than one without it, even when querying against the column in the second index.
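For comparison, here is a sketch of the two layouts. The table names, index names and column types are assumptions; only the column names come from the question.

```sql
-- Heap table: the primary key serves lookups by ID, and one secondary
-- index serves the queries that start with ID_DRUG_NAME (a).
CREATE TABLE drug_link (
  id            NUMBER   NOT NULL,
  id_drug_name  NUMBER   NOT NULL,
  id_from       NUMBER,
  id_provider   NUMBER,
  deleted       CHAR(1)  DEFAULT 'N' NOT NULL,
  CONSTRAINT drug_link_pk PRIMARY KEY (id)
);
CREATE INDEX drug_link_name_ix
  ON drug_link (id_drug_name, id_from, id_provider);

-- IOT variant: the rows are stored inside the primary key index itself,
-- so only lookups by ID get the direct access path.
CREATE TABLE drug_link_iot (
  id            NUMBER   NOT NULL,
  id_drug_name  NUMBER   NOT NULL,
  id_from       NUMBER,
  id_provider   NUMBER,
  deleted       CHAR(1)  DEFAULT 'N' NOT NULL,
  CONSTRAINT drug_link_iot_pk PRIMARY KEY (id)
) ORGANIZATION INDEX;
```

In the heap layout, the single index on (id_drug_name, id_from, id_provider) can serve the a,b and a,b,c queries directly; for a,c it can still be entered through a, with c checked inside the index.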

Using Partitioning and Indexing on Same Column in Oracle is there any benefit out there

We have a database design with a table that is interval-partitioned by day on a column named "5mintime", and we have also created an index on the same column.
The "5mintime" column holds values such as 1-Mar-2011 and 2-Mar-2011; in short, there is no time component in it, and in the UI the user can select a period of one day at minimum.
My question: when running select queries, is there any advantage gained from the index, given that the partitioning is already there? On the flip side, if I remove the index, inserts will become faster. Any help on this would be greatly appreciated.
If I understand you right, then I think there's no need for the index:
A local index is built separately for every partition, and in your case each partition holds the same value in all rows (i.e. 1-Mar-2011 in the 1-Mar-2011 partition, 2-Mar-2011 in the 2-Mar-2011 partition and so on).
A global index covers the whole table, but a lookup will just find a whole partition, which is also not useful since you already have partitions...
But, why not check it?
If each day's data goes into its own partition and you can never search within days, but only for entire days' worth of data, then, no, I don't see this index adding any value.
You can confirm whether or not SQL queries are using this index by enabling monitoring:
alter index myindex monitoring usage;
And then check to see if it's been used by querying v$object_usage for it some time later.
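The whole check might look like this, assuming the index is called myindex as above. V$OBJECT_USAGE only shows indexes owned by the user you are connected as, so run it as the index owner.

```sql
-- Start recording whether the optimizer uses the index.
ALTER INDEX myindex MONITORING USAGE;

-- ... let the normal workload run for a representative period ...

-- USED = 'YES' means at least one statement used the index since monitoring started.
SELECT index_name, table_name, monitoring, used, start_monitoring
FROM   v$object_usage
WHERE  index_name = 'MYINDEX';

-- Stop monitoring when you have your answer.
ALTER INDEX myindex NOMONITORING USAGE;
```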

Force oracle to use index

Is there any way to force Oracle to use an index other than hints?
No. And if the optimizer doesn't use the index, it usually has a good reason for it. Index usage, if the index is poor, can actually slow your queries down.
Oracle doesn't use an index when it thinks the index is:
disabled
invalid (for example, after a huge data load and the statistics about the index haven't been updated)
won't help (for example, when there are only two different values in 5 million rows)
So the first thing to check is that the index is enabled, then run the correct GATHER command on your index/table/schema. When that doesn't help, Oracle thinks that loading your index will actually take more time than loading the actual row values. In this case, add more columns to the index to make it appear more "diverse".
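Those checks could be sketched as follows. MY_TABLE and MY_INDEX are placeholder names, and the rebuild is only needed if the status query reports the index as UNUSABLE.

```sql
-- Check index state and when statistics were last gathered.
SELECT index_name, status, last_analyzed
FROM   user_indexes
WHERE  table_name = 'MY_TABLE';

-- Only if STATUS is UNUSABLE: rebuild the index.
ALTER INDEX my_index REBUILD;

-- Refresh optimizer statistics for the table and its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'MY_TABLE',
    cascade => TRUE);
END;
/
```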
You might take a look at Oracle stored outlines. You can take an existing query, create a stored outline for it and tweak its plan much as you would with hints. They are just very hard to use, so do some research before you decide to implement stored outlines.
You can add hints to the query that will make the optimizer favor one index over another.
In general, if you have collected good statistics on all the tables and indexes, Oracle usually produces very good execution plans.
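For reference, an index hint has the following shape. The table orders, its alias o and the index orders_customer_ix are hypothetical names; the hint is a request, and the optimizer ignores it if the index cannot be used for the query.

```sql
-- The hint names the table alias first, then the index to favor.
SELECT /*+ INDEX(o orders_customer_ix) */
       o.order_id, o.order_date
FROM   orders o
WHERE  o.customer_id = 42;
```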
If your query doesn't include the indexed field in its conditions, then the DB would be foolish to use the index. Thus, I second Donnie's answer.
Yes, technically, you can force Oracle to use an index (without hints), in one scenario: if the table is an index-organized table, then logically the only way to query the table is via its index because there is no table to query.

Does the order of columns on a covered index in Sybase affect select performance?

We have a large table, with several indices (say, I1-I5).
The usage pattern is as follows:
Application A: all select queries 100% use indices I1-I4 (assume that they are designed well enough that they will never use I5).
Application B: has only one select query (fairly frequently run), which contains 6 fields and for which a fifth index I5 was created as a covered index.
The first 2 fields of the covered index are date, and a security ID.
The table contains rows for ~100 dates (in date order, enforced by a clustered index I1), and tens of thousands of security identifiers.
Question: does the order of columns in the covered index affect the performance of the select query in Application B?
I.e., would the query performance change if we switched around the first two fields of the index (date and security ID)?
Would the query performance change if we switch around one of the last fields?
I am assuming that the logical IOs would remain un-affected by any order of fields in the covered index (though I'm not 100% sure).
But will there be other performance effects? (Optimizer speed, caching, etc...)
The question is version-generic, but if it matters, we use Sybase 12.
Unfortunately, the table is so huge that actually changing the index in practice and quantitatively confirming the effects of the change is extremely difficult.
It depends. If you have a WHERE clause such as the following, you will get better performance out of an index on (security_ID, date_column) than the converse:
WHERE date_column BETWEEN DATE '2009-01-01' AND DATE '2009-08-31'
AND security_ID = 373239
If you have a WHERE clause such as the following, you will get better performance out of an index on (date_column, security_ID) than the converse:
WHERE date_column = DATE '2009-09-01'
AND security_ID > 499231
If you have a WHERE clause such as the following, it really won't matter very much which column appears first:
WHERE date_column = DATE '2009-09-13'
AND security_ID = 211930
We'd need to know about the selectivity and conditions on the other columns in the index to know if there are other ways of organizing your index to gain more performance.
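The two orderings discussed above, written out as index definitions. The table name prices and the index names are made up for illustration; the column names follow the WHERE clauses above.

```sql
-- Equality on security_ID, range on date_column:
-- the index is entered at one security and the date range is read contiguously.
CREATE INDEX i5_sec_date ON prices (security_ID, date_column);

-- Equality on date_column, range on security_ID:
-- the index is entered at one date and the security range is read contiguously.
CREATE INDEX i5_date_sec ON prices (date_column, security_ID);
```

The general pattern is to put the column tested with equality first and the column tested with a range after it, so that the matching entries sit next to each other in the index.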
Just like your question is version generic, my answer is DBMS-generic.
"Unfortunately, the table is so huge that actually changing the index in practice and quantitatively confirming the effects of the change is extremely difficult."
The problem is not the size of the table. Millions of rows is nothing for Sybase.
The problem is an absence of a test system.
