Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I would like to know if there are general rules for creating an index or not.
How do I choose which fields I should include in this index or when not to include them?
I know its always depends on the environment and the amount of data, but I was wondering if we could make some globally accepted rules about making indexes in Oracle.
The Oracle documentation has an excellent set of considerations for indexing choices: http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/data_acc.htm#PFGRF004
Update for 19c: https://docs.oracle.com/en/database/oracle/oracle-database/19/tgdba/designing-and-developing-for-performance.html#GUID-99A7FD1B-CEFD-4E91-9486-2CBBFC2B7A1D
Quoting:
Consider indexing keys that are used frequently in WHERE clauses.
Consider indexing keys that are used frequently to join tables in SQL statements. For more information on optimizing joins, see the section "Using Hash Clusters for Performance".
Choose index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value. Note: Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints.
Indexing low selectivity columns can be helpful if the data distribution is skewed so that one or two values occur much less often than other values.
Do not use standard B-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless the index is modified frequently, as in a high concurrency OLTP application.
Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.
Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function, other than MIN or MAX, or an operator with an indexed key does not make available the access path that uses the index except with function-based indexes.
Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.
When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.
There are some things you should always index:
Primary Keys - these are given an index automatically (unless you specify a suitable existing index for Oracle to use)
Unique Keys - these are given an index automatically (ditto)
Foreign Keys - these are not automatically indexed, but you should add one to avoid performance issues when the constraints are checked
After that, look for other columns that are frequently used to filter queries: a typical example is people's surnames.
From the 10g Oracle Database Application Developers Guide - Fundamentals, Chapter 5:
In general, you should create an index on a column in any of the following situations:
The column is queried frequently.
A referential integrity constraint exists on the column.
A UNIQUE key integrity constraint exists on the column.
Use the following guidelines for determining when to create an index:
Create an index if you frequently want to retrieve less than about 15% of the rows in a large table. This threshold percentage varies greatly, however, according to the relative speed of a table scan and how clustered the row data is about the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage.
Index columns that are used for joins to improve join performance.
Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key; see Chapter 6, "Maintaining Data Integrity in Application Development" for more information.
Small tables do not require indexes; if a query is taking too long, then the table might have grown from small to large.
Some columns are strong candidates for indexing. Columns with one or more of the following characteristics are good candidates for indexing:
Values are unique in the column, or there are few duplicates.
There is a wide range of values (good for regular indexes).
There is a small range of values (good for bitmap indexes).
The column contains many nulls, but queries often select all rows having a value. In this case, a comparison that matches all the non-null values, such as:
WHERE COL_X >= -9.99 *power(10,125)
is preferable to
WHERE COL_X IS NOT NULL
This is because the first uses an index on COL_X (assuming that COL_X is a numeric column).
Columns with the following characteristics are less suitable for indexing:
There are many nulls in the column and you do not search on the non-null values.
Wow, that's just such a huge topic, it's hard to answer in this format. I srtongly recommend this book.
Relational Database Index Design and the Optimizers
by Tapio Lahdenmaki
You don't just use indexes to make table access faster, sometimes you make indexes to avoid table access altogether. Something not mentioned yet but vital.
There's a whole science to this if you really want to make your database perform maximally.
Ah, one specific optimization to Oracle is building reverse key indexes. If you have a PK index of a monoatomically increasing value, like a sequence, and you have highly concurrent inserts and don't plan to range scan that column then make it a reverse key index.
See how specific these optimizations can be?
Look into Database Normalization - you'll find a lot of good, industry standard rules about what keys should exist, how databases should be related, and hints on indexes.
-Adam
Usually one puts the ID columns up front and those usually identify the rows uniquely. A combination of columns can also do the same thing. As an example using cars... tags or license plates are unique and qualify for an index. They (the tags column) can qualify for the primary key. The owners name can qualify for an index if you are going to search on name. make of car really shouldn't get an index in the beginning as it's not going to vary too much. Indexes don't help if the data in the column doesn't vary too much.
Take a look at the SQL - what are the where clauses looking at. Those may need an index.
Measure. What is the issue - pages/queries taking too long ? what's being used for the queries. Create an index on those columns.
Caveats: indexes need time for updates and space.
and sometimes full table scans are quicker than an index. small tables can be scanned quicker than getting the index and then hitting the table. Look at your joins.
Related
I´m currently working on optimzing my database schema in regards of index structures. As I´d like to increase my DDL performance I´m searching for potential drop candidates on my Oracle 12c system. Here´s the scenario in which I don´t know what the consequences for the query performance might be if I drop the index.
Given two indexes on the same table:
- non-unique, single column index IX_A (indexes column A)
- unique, combined index UQ_AB (indexes column A, then B)
Using index monitoring I found that the query optimizer didn´t choose UQ_AB, but only IX_A (probably because it´s smaller and thus faster to read). As UQ_AB contains column A and additionally column B I´d like to drop IX_A. Though I´m not sure if I get any performance penalties if I do so. Does the higher selectivity of the combined unique index have any influence on the execution plans?
It could do, though it's quite likely to be minor (usually). Of course it depends on various things, for example how large the values in column B are.
You can look at various columns in USER_INDEXES to compare the two indexes, such as:
BLEVEL: tells you the "height" of the index tree (well, height is BLEVEL+1)
LEAF_BLOCKS: how many data blocks are occupied by the index values
DISTINCT_KEYS: how "selective" the index is
(You need to have analyzed the table first for these to be accurate). That will give you an idea of how much work Oracle needs to do to find a row using the index.
Of course the only way to really be sure is to benchmark and compare timings or even trace output.
I have a table in SYBASE which has around 1mio rows. This table currently does not have any index created and I would like to create one now. My questions are
What precautions should I take before creating an index?
Does this process require more tablespace to be allocated?
Any other performance considerations I should take care of?
Cheers
Ranjith
From manual.
When to index
Use the following general guidelines:
If you plan to do manual insertions into the IDENTITY column, create
a unique index to ensure that the inserts do not assign a value that
has already been used.
A column that is often accessed in sorted order, that is, specified in the order by clause, probably should be indexed so that
Adaptive Server can take advantage of the indexed order.
Columns that are regularly used in joins should always be indexed, since the system can perform the join faster if the columns
are in sorted order.
The column that stores the primary key of the table often has a clustered index, especially if it is frequently joined to columns in
other tables. Remember, there can be only one clustered index per
table.
A column that is often searched for ranges of values might be a good choice for a clustered index. Once the row with the first value
in the range is found, rows with subsequent values are guaranteed to
be physically adjacent. A clustered index does not offer as much of
an advantage for searches on single values.
When not to index
In some cases, indexes are not useful:
Columns that are seldom or never referenced in queries do not benefit
from indexes, since the system seldom has to search for rows on the
basis of values in these columns.
Columns that can have only two or three values, for example, "male" and "female" or "yes" and "no", get no real advantage from
indexes.
Try
sp_spaceused tablename, 1
Here is link to documentation.
Yes - Updating statistics about indexes.
Here is link to documentation.
We have a huge table which are 144 million rows available right now and also increasing 1 million rows each day.
I would like to create a partitioning table on Oracle 11G server but I am not aware of the techniques. So I have two question :
Is it possible to create a partitioning table from a table that don't have PK?
What is your suggestion to create a partitioning table like huge records?
Yes, but keep in mind that the partition key must be a part of PK
Avoid global indexes
Chose right partitioning key - have it prepared for some kind of future maintenance ( cutting off oldest or unnecessary partitions, placing them in separate tablespaces... etc)
There are too many things to consider.
"There are several non-unique index on the table. But, the performance
is realy terrible! Just simple count function was return result after
5 minutes."
Partitioning is not necessarily a performance enhancer. The partition key will allow certain queries to benefit from partition pruning i.e. those queries which drive off the partition key in the WHERE clause. Other queries may perform worse, if there WHERE clause runs against the grain of the partition key,
It is difficult to give specific advice because the details you've posted are so vague. But here are some other possible ways of speeding up queries on big tables:
index compression
parallel query
better, probably compound, indexes.
Assumptions:
I have a number of tables comprised of facts and foreign keys ('dimensional' and 'key-value' type). For example, ENCOUNTER:
ID - primary key
dimensions
LOCATION_ID
PATIENT_ID
key-value
TYPE_ID
STATUS_ID
PATIENT_CLASS_ID
DISPOSITION_ID
...
facts
ADMISSION_DATE
DISCHARGE_DATE
...
I don't have the option to create a data warehouse
I would like to simplify the data structure for reporting
My approach is to create a number of pseudo-dimensional views ('D_LOCATION' based on the DEPARTMENT and LOCATION tables) and pseudo-fact views ('F_ENCOUNTER' based on ENCOUNTER table). In the pseudo-fact view, I would JOIN the key-value tables (e.g. STATUS, PATIENT_CLASS) to the fact table to include the name fields (e.g. STATUS.NAME, PATIENT_CLASS.NAME).
Questions:
If a query selects a subset of all of the fields from F_ENCOUNTER (i.e. not all of the key-value.name fields), is the Oracle 10g optimizer smart enough to exclude some of the key-value table joins (i.e. the ones that aren't included in the query)?
Is there anything that I can do to optimize this architecture (other than indices)
Is there another approach?
** edit **
Goals (in order of importance):
reduce query complexity; increase query consistency; decrease report-development time
optimize query-processing
minimize administrator burden
decrease storage
One optimization suggestion is not to use key-value pair tables. The concept of a Dimension table is that each record should contain all information about that concept without needing to join to normalized tables - i.e. turning a star schema into a snowflake schema.
While values might be repeated across dimension table records, it has the advantage of fewer joins in your reporting queries. Denormalizing tables in this way might seem counter intuitive but where performance is paramount it is usually the best solution.
I don't believe Oracle would exclude any joins done in the view, because the joins can impact the number of rows returned. (As when an inner join fails to match any rows, making the whole result set empty.)
What are the goals of your optimization? Query speed? query simplicity? storage efficiency? If you can sacrifice storage efficiency for better query performance, then replace the key-value references with the values themselves (TYPE_NAME instead of TYPE_ID, PATIENT_CLASS_NAME instead of PATIENT_CLASS_ID, etc.).
[Edit:] If the original architecture cannot be modified, consider using a materialized view. It would essentially pre-compute the joins and store the result set, giving you speedy query time at the cost of extra storage space and possibly-not-fresh data. You can control the latter by specifying an appropriate refresh policy. See http://en.wikipedia.org/wiki/Materialized_view and http://download.oracle.com/docs/cd/B10500_01/server.920/a96520/mv.htm for further details.
As always, I apologize if this is a stupid question (two questions, actually). I'm not a DBA, so I know very little about indexes. My questions are:
Is there any cutoff point (in terms of number of rows) at which an index would be pointless? For example, is there any benefit to an index on a lookup table with 10-20 rows?
I've read some things about covering indexes in Oracle, and the concept makes sense in that the data can be retrieved directly from the index and a trip to the table is unnecessary. How can I tell if an index is a covering index? Is this a value set when the index is created, or by default based on the rows that the index includes?
I hope this makes sense.
Richard Foote has a series of blog posts on indexes for small tables. The short answer is probably not (but the long answer is much more interesting).
A covering index is a general term for an index that contains all the columns that are part of either the SELECT list or the WHERE clause for a table. It is not a property of an index-- any index can be a covering index for some query. It is something that is specific to a query and to the indexes available to the optimizer.
Small tables can be created as an "indexed oriented table" in which case the table and the index are the same "object" in storage. I believe a primary key is required for an IOT. Before IOTs were available it was in many cases a judgement call. Certainly, if the index was almost as big or bigger than the table it's useless unless it's being used to enforce uniqueness.
another Richard Foote link
You have to look at the query being run and compare it to the index. An index on (A, B) covers:
select B from T where A = 3