My sample Oracle query has this structure:
SELECT <LIST_OF_COLUMNS> FROM <TABLE_NAME>
WHERE COLUMN_01 = <SOMETHING> AND COLUMN_02 = <SOMETHING> AND COLUMN_03 = <SOMETHING>
The table has over 1 million records. I have indexed COLUMN_01, COLUMN_02 and COLUMN_03 separately. The above query works fine and provides results as expected.
If I make COLUMN_01, COLUMN_02 and COLUMN_03 (all the columns in the WHERE clause) into a composite index without changing the existing indexes, will it improve performance? If so, is there an order for the composite index columns?
If I use OR instead of AND, as in this query, will it improve performance?
SELECT <LIST_OF_COLUMNS> FROM <TABLE_NAME>
WHERE COLUMN_01 = <SOMETHING> OR COLUMN_02 = <SOMETHING> OR COLUMN_03 = <SOMETHING>
If you create a composite index on the three columns and you never query the table using only one of them in the WHERE clause, you won't need the single-column indexes. Otherwise it depends on which columns participate in each query, and every such case should be analyzed and tested.
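For illustration, a minimal sketch of the composite index, using hypothetical names in place of the question's placeholders:
create index tab_c123_idx on my_table (column_01, column_02, column_03);
-- if no query ever filters on column_01 alone, the old single-column index
-- on column_01 becomes redundant: the composite index already leads with it
-- drop index tab_c1_idx;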
Column order in a composite index does matter. Columns should be ordered by uniqueness, with the least distinct column first. This helps trim down the number of rows matching the query predicate and thus speeds up the query.
It should also be noted that Oracle can use a composite index for queries that do not reference all the index columns in their predicates.
For example:
create index idx1 on table_name (col1, col2, col3);
/*In this query Oracle can use index idx1 as a standard one-column index,
because col1 is the first column in the index*/
select * from table_name where col1 = 'some_value';
/*Here Oracle can still use the composite index,
but in this case it will use an INDEX SKIP SCAN (logically probing the index
once for each distinct value of col1), which is slower than an ordinary index range scan*/
select * from table_name where col2 = 'some_value1' and col3 = 'some_value2';
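Whichever index ends up being created, the access path Oracle actually chooses can be checked with a standard EXPLAIN PLAN (table and values are the placeholders from above):
explain plan for
select * from table_name where col2 = 'some_value1' and col3 = 'some_value2';

select * from table(dbms_xplan.display);
-- look for INDEX RANGE SCAN, INDEX SKIP SCAN or TABLE ACCESS FULL in the plan output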
Whether you use OR or AND does not really matter here. What matters more is the number of rows that match the given predicate.
If I make COLUMN_01, COLUMN_02 and COLUMN_03 (all the columns in the WHERE clause) into a composite index without changing the existing indexes, will it improve performance?
Probably. One index that satisfies all the WHERE criteria serves as a complete access path and is therefore more effective than a single-column index access path. With only single-column indexes, the optimizer picks one of them, reads all the rows matching (say) the COLUMN_02 criterion via that index, and then filters those rows using the other columns' criteria.
The price you pay for this improvement in performance is the overhead of maintaining an additional index. So you should consider whether you still need all three single-column indexes (for other queries).
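If you are not sure whether the existing single-column indexes still serve other queries, Oracle's index usage monitoring can help you decide (a sketch with a hypothetical index name; on 11g the result is exposed through V$OBJECT_USAGE):
alter index idx_column_02 monitoring usage;
-- ... let a representative workload run ...
select index_name, monitoring, used from v$object_usage;
alter index idx_column_02 nomonitoring usage;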
is there an order for the composite index columns?
Yes. Put them in ascending order of the number of distinct values: the leading index column should be the least discriminating one. Having a unique key as the leading column is probably a disaster, although there are edge cases, so be sure to benchmark.
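To apply that rule, you could first count the distinct values per column (placeholder names again):
select count(distinct column_01) c1,
       count(distinct column_02) c2,
       count(distinct column_03) c3
from   table_name;
-- order the composite index from the smallest count to the largest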
If I use OR instead of AND, as in this query, will it improve performance?
You're going to be returning more rows, which in itself is more work. It is also hard to use indexes in such a situation, so most likely you're facing a Full Table Scan. But why not try it and see what happens?
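If the OR version does prove slow, one rewrite worth testing is splitting it into a UNION so each branch can use its own single-column index; the optimizer can perform a similar OR-expansion on its own, so treat this as an experiment, not a guaranteed win:
select <LIST_OF_COLUMNS> from <TABLE_NAME> where column_01 = <SOMETHING>
union
select <LIST_OF_COLUMNS> from <TABLE_NAME> where column_02 = <SOMETHING>
union
select <LIST_OF_COLUMNS> from <TABLE_NAME> where column_03 = <SOMETHING>;
-- UNION (not UNION ALL) removes the duplicates that match more than one predicate,
-- at the cost of a sort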
If I make COLUMN_01, COLUMN_02 and COLUMN_03 (all the columns in the WHERE clause) into a composite index without changing the existing indexes, will it improve performance?
For this query: likely. For INSERT/UPDATE/DELETE: performance will deteriorate.
So you'll need to measure and see whether the improvement in some queries justifies the deterioration in others.
If so, is there an order for the composite index columns?
Not for this query. If you prefix-compress the index, you may choose the order that compresses best; otherwise it shouldn't matter much.
However, there may be other queries that use only some of the indexed columns, in which case you'd want to make sure the columns that are actually used are at the leading edge of the index.
If I use OR instead of AND, as in this query, will it improve performance?
No. Separate single-column indexes (which you already have) are what is needed in this case.
I have a table A(id, name, code) and this SQL statement:
Select * from A where upper(code || name) like upper('%<search text>%');
How can I create an index so that this statement uses it?
The question applies to both a partitioned and a non-partitioned table.
Thanks & BR
Do you have a performance issue or is this just a hypothetical question?
An index is unlikely to help with this example: a full table scan will probably be the quickest solution. Why? Your table has 3 columns, and the best index would be one that avoided looking in the table at all, e.g.:
create index ai on a (code, name, id);
But that index needs to contain all the same data as the table plus a ROWID for each table row - so it is going to be bigger than the table and take longer to scan. You could try putting the least selective columns first and using compression:
create index ai on a (code, name, id) compress;
Now the index may be smaller than the table - it depends on how selective the code and name columns are. If it is small enough, the optimizer might decide to use it instead of the table. It still contains all the IDs and ROWIDs, though, so the reduction in size probably won't be dramatic. In the test case I set up, the compressed index is about half the size of the table, yet Explain Plan shows the query has a higher cost if I use a hint to force it to use the index - maybe due to the overheads of compression, I don't know.
You could look into Oracle Text and the CONTAINS expression - but then you would be writing a different query, not using LIKE.
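For completeness, an Oracle Text approach might look like the sketch below. It assumes the Oracle Text option (CTXSYS) is installed, indexes a single column for simplicity, and has word-based rather than pure substring semantics, so it is not a drop-in replacement for LIKE:
create index a_name_txt on a (name) indextype is ctxsys.context;

select * from a where contains(name, 'search text') > 0;
-- the index must also be kept in sync after DML, e.g. with ctx_ddl.sync_index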
Reading these Oracle docs about indexes, this one about insert speed, and this question on StackOverflow led me to the conclusion that:
Indexes help us locate information faster
Primary and Unique Keys are indexed automatically
Inserting with indexes can cause worse performance
However, whenever indexes are discussed, only SELECT operations are shown as examples.
My question is: are indexes used in INSERT and UPDATE operations? When and how?
My guesses are:
UPDATE can use an index in its WHERE clause (if the column in the clause is indexed)
INSERT can use an index when it uses a SELECT (but in that case the index belongs to another table)
or probably when checking integrity constraints
but I don't have deep knowledge of how indexes are used.
For UPDATE statements, an index can be used by the optimiser if it deems the index can speed the statement up; the index would be used to locate the rows to be updated. The index is also a table, in a manner of speaking, so if the indexed column is being updated, Oracle obviously needs to update the index as well. On the other hand, if you're running an update without a WHERE clause, the optimiser may choose not to use an index, since it has to access the whole table and a full table scan may be more efficient (though it may still have to update the index). The optimiser makes those decisions at runtime based on several parameters, such as whether there are valid stats against the tables and indexes in question, how much data is affected, what type of hardware, etc.
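As a hedged illustration of both cases, assuming a hypothetical emp table with an index on dept_id:
-- the index on dept_id can be used to locate the rows to change:
update emp set salary = salary * 1.1 where dept_id = 10;

-- no WHERE clause: every row is touched, so a full table scan is likely;
-- if salary itself were indexed, that index would still have to be maintained:
update emp set salary = salary * 1.1;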
For INSERT statements, though the INSERT itself does not need the index, the index will also need to be 'inserted into', so it will be accessed by Oracle. Another case where an INSERT can cause an index to be used is one like this:
INSERT INTO mytable (mycolumn)
SELECT mycolumn + 10 FROM mytable;
An INSERT statement gets no direct benefit from an index, but more indexes on a table make inserts slower. Consider a table with no indexes: to add a row, Oracle only needs to find a table block with enough free space and store the row there. If the table has indexes, the database must also make sure the new row can be found via those indexes, so an entry has to be added to each index too. That multiplies the work of the insert operation: the more indexes you have, the more time you need to insert new rows.
For UPDATE it depends on whether you update an indexed column or not. If you are not updating an indexed column, performance should not be affected. An index can also speed up an UPDATE statement if the WHERE conditions can make use of it.
I have a massive table on which I can't do any more partitioning or sub-partitioning, nor am I allowed to run any ALTER. I want to query its records in batches, and thought a good way would be to use the last two digits of the account numbers (no other field would split the records as evenly).
I guess I'd need to at least index that somehow (remember I can't alter the table to add a virtual column either).
Is there any kind of index to be used in such situation?
I am using Oracle 11gR2
You can use a function-based index:
create index two_digits_idx on table_name (substr(account_number, -2));
This index will only work in queries like this:
select ...
from table_name t ...
where substr(account_number, -2) = '25' -- or any other two digits
For the index to be used, the query must contain the same expression as the index definition.
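For instance, this query could not use the index above, because its predicate is written against the raw column rather than the indexed expression:
select ...
from   table_name t
where  account_number like '%25' -- not substr(account_number, -2), so no index access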
I have a pretty wide table that I expect to get 7 million records a month. A small portion of these records are expected to be flagged as "problem" records.
What is the best way to implement the table to locate these records in an efficient way?
I'm new to Oracle, but is a materialized view a valid option? Does Oracle have such a thing as indexed views, or is that essentially the same concept?
Most of the reporting is by month, so partitioning by month seems like an option, but a "problem" record may theoretically linger for several months. Otherwise, the reporting should be mostly for the current month. Would you expect that querying across all month partitions to locate any problem record would cause significant performance issues compared to using a single table?
Your general thoughts on where to start would be appreciated. I realize I need to read up and I'll do that, but I wanted to get the community's thoughts first to make sure I read the right stuff.
One more thought: the primary key is a GUID VARCHAR2(36). In terms of order of magnitude, how much of a performance hit would you expect this to be relative to using a NUMBER data type PK? This worries me, but it is out of my control.
It depends what you mean by "flagged", but it sounds to me like you would benefit from a simple index, function based index, or an indexed virtual column.
In all cases you should be careful to ensure that all the index columns are NULL for rows that do not need to be flagged. This way your index will contain only the rows that are flagged (Oracle does not - by default - index rows in B-Tree indexes where all index column values are NULL).
Your primary key being a VARCHAR2 GUID should make no difference, at least with regard to the specific flagging of rows in this question: indexes point to rows via Oracle's internal ROWIDs.
Indexes support partitioning, so if your data is already partitioned, your index could be set to match.
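For example, if my_table were partitioned by month, the flag index could be created as a local index so that each partition has its own index segment (a sketch; the syntax assumes a partitioned table):
CREATE INDEX my_table_problems_idx ON my_table (problem_flag) LOCAL
/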
Simple column index method
If you can dictate how the flagging works, or the column already exists, then I would simply add an index to it like so:
CREATE INDEX my_table_problems_idx ON my_table (problem_flag)
/
Function-based index method
If the data model is fixed / there is no flag column, then you can create a function-based index assuming that you have all the information you need in the target table. For example:
CREATE INDEX my_table_problems_fnidx ON my_table (
CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END
)
/
Now if you use the same logic in your SELECT statement, you should find that it uses the index to efficiently match rows.
SELECT *
FROM my_table
WHERE CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END IS NOT NULL
/
This is a bit clunky though, and it requires you to use the same logic in queries as the index definition. Not great. You could use a view to mask this, but you're still duplicating logic in at least two places.
Indexed virtual column
In my opinion, this is the best way to do it if you are computing the value dynamically (available from 11g onwards):
ALTER TABLE my_table
ADD virtual_problem_flag VARCHAR2(1) AS (
CASE
WHEN amount > 100 THEN 'Y'
ELSE NULL
END
)
/
CREATE INDEX my_table_problems_idx ON my_table (virtual_problem_flag)
/
Now you can just query the virtual column as if it were a real column, i.e.
SELECT *
FROM my_table
WHERE virtual_problem_flag = 'Y'
/
This will use the index and puts the function-based logic into a single place.
Create a new table with just the PKs of the problem rows.
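A minimal sketch of that idea (names are hypothetical, and the side table must be maintained by whatever process flags the rows):
CREATE TABLE problem_rows (
  id VARCHAR2(36) NOT NULL REFERENCES my_table,
  CONSTRAINT problem_rows_pk PRIMARY KEY (id)
)
/

SELECT t.*
FROM   my_table t
JOIN   problem_rows p ON p.id = t.id
/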
I have a table which is basically a tree structure with a column parent_id and id.
parent_id is null for root nodes.
There is also a self referential foreign key, so that every parent_id has a corresponding id.
This table is mainly read-only with mostly infrequent batch updates.
One of the most common queries from the application which accesses this table is select ... where parent_id = X. I thought this might be faster if this table was index organised on parent_id.
However, I'm not sure how to index organise this table if parent_id can be null. I'd rather not fudge things so that parent_id=0 is some special id, as I'd have to add dummy values to the table to ensure the foreign key constraints are satisfied, and it also changes the application logic.
Is there any way to index organise a table by possible null value columns?
Solution from question asker:
I found I could get the same benefits from index organisation just by adding the queried columns to the end of the parent_id index, i.e. instead of:
create index foo_idx on foo_tab(parent_id);
I do:
create index foo_idx on foo_tab(parent_id, col1, col2, col3);
Where col1, col2, col3 etc are frequently accessed columns.
I've only done this with indexes that are used to return multiple rows, which benefit from the ordering and hence the disk locality the index provides, instead of having to jump around the table. Indexes that generally return single rows I've left referencing the table, as there is only one row to read anyway, so locality matters much less.
Like I mentioned, this is a mainly read table, and space is not a huge concern either, so I don't think the write overhead caused by these indexes is a big problem.
(I realise this won't index null parent_ids, but instead I've made another index on decode(parent_id, null, 1, null) which indexes nulls and only nulls).
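For reference, that nulls-only index would look something like this (the index name is mine):
create index foo_nulls_idx on foo_tab (decode(parent_id, null, 1, null));
-- only rows where parent_id is null produce a non-null key, so only those rows are indexed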
I would try adding the index on the single column parent_id.
If all of the columns in your index are null for a given row, then that row does not appear in the index.
So for the parent_id = X query you cite above, this index should be used. However, if you query parent_id is null, the index won't be used, and you'll get the same performance as you have now. This sounds like behaviour that would suit you.
I have used this in the past to improve the performance of queries. It works particularly well if the number of items in the index is small compared to the number of rows in the table. We had about 3% of our rows in this particular index, and it flew :-)
But, as always, you need to try it and measure the difference in performance. Your mileage may vary.