Does making a primary key in multiple columns generate indexes for all of them? - oracle

If I set a primary key in multiple columns in Oracle, do I also need to create the indexes if I need them?
I believe that when you set a primary key on one column, you have it indexed by it; is it the same with multiple column PKs?
Thanks

No, indexes will not be created for the individual fields.
If you have a composite key (FieldA, FieldB, FieldC) and you run
select * from MyTable where FieldA = :a
or
select * from MyTable where FieldA = :a and FieldB = :b
then it will use this index (because they are the leading fields in the key).
If you have
select * from MyTable where FieldB = :b and FieldC = :c
where you are using parts of the index but not its leading column, the index can still be used, but less efficiently, through an index skip scan, full index scan, or fast full index scan.
(Thanks to David Aldridge for the correction)
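One way to see which access path the optimizer picks in each case is to compare execution plans. The following is only a sketch: the table, its `payload` column, and the constraint name `mytable_pk` are assumptions made up for illustration, and on a small or empty table the optimizer may well prefer a full table scan regardless.

```sql
-- Hypothetical table mirroring the FieldA/FieldB/FieldC example above.
CREATE TABLE mytable (
  fielda  NUMBER NOT NULL,
  fieldb  NUMBER NOT NULL,
  fieldc  NUMBER NOT NULL,
  payload VARCHAR2(100),
  CONSTRAINT mytable_pk PRIMARY KEY (fielda, fieldb, fieldc)
);

-- Predicates on the leading columns of the key:
-- a range scan on MYTABLE_PK is a candidate access path.
EXPLAIN PLAN FOR
  SELECT * FROM mytable WHERE fielda = :a AND fieldb = :b;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- No predicate on the leading column: look for a skip scan,
-- full index scan, or full table scan in the plan instead.
EXPLAIN PLAN FOR
  SELECT * FROM mytable WHERE fieldb = :b AND fieldc = :c;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The plan you actually get depends on table size, statistics, and Oracle version, so treat the comments as what to look for rather than guaranteed output.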

If you create a primary key on columns (A, B, C) then Oracle will by default create a unique index on (A, B, C). You can tell Oracle to use a different (not necessarily unique) existing index like this:
alter table mytable add constraint mytable_pk
primary key (a, b, c)
using index mytable_index;

You will get one index across multiple columns, which is not the same as having an index on each column.

A primary key implies a composite unique index on the primary key columns.
You can use a special access path called INDEX SKIP SCAN to use this index with predicates that do not include the first indexed column:
SQL> CREATE TABLE t_multiple (mul_first INTEGER NOT NULL, mul_second INTEGER NOT NULL, mul_data VARCHAR2(200))
2 /
Table created
SQL> ALTER TABLE t_multiple ADD CONSTRAINT pk_mul_first_second PRIMARY KEY (mul_first, mul_second)
2 /
Table altered
SELECT /*+ INDEX_SS (m pk_mul_first_second) */
*
FROM t_multiple m
WHERE mul_second = :test
SELECT STATEMENT, GOAL = ALL_ROWS
TABLE ACCESS BY INDEX ROWID SCOTT T_MULTIPLE
INDEX SKIP SCAN SCOTT PK_MUL_FIRST_SECOND

A primary key is only one (unique) index, possibly containing multiple columns.

For a query that filters only on B, the index will be used only if column A has low cardinality (e.g. A has only 2 values).
In general, you could have guessed this answer if you imagined that the columns are not indexed separately, but that the concatenation of the columns is indexed (that's not completely true, but it works as a first approximation).
So it's not an index on a and on b; it's more like an index on a||b.

You may need to set individual indexes on the columns depending on your primary key structure.
Composite primary keys and indexes will create indexes in the following manner. Say I have columns A, B, C and I create the primary key on (A, B, C). This will result in the indexes
(A, B, C)
(A, B)
(A)
Oracle actually creates an index on any of the leftmost column groupings. So if you want an index on just column B, you will have to create one for it in addition to the primary key.
P.S. I know MySQL exhibits this leftmost behaviour, and I think SQL Server is also leftmost.

In Oracle, that's not an accurate statement. It creates only one index, on (A,B,C). It does not create separate (A,B) and (A) indexes; queries on those leading columns simply use the single (A,B,C) index.
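You can check this against the data dictionary. A minimal sketch, assuming a table named MYTABLE in your own schema whose only index is the one backing a primary key on (A, B, C):

```sql
-- List every index on the table together with its columns, in key order.
SELECT index_name, column_position, column_name
FROM   user_ind_columns
WHERE  table_name = 'MYTABLE'   -- dictionary names are upper case by default
ORDER  BY index_name, column_position;
```

Under those assumptions the result should show a single index name with three rows, one per key column, and no separate entries for (A, B) or (A).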

Related

Indexes Definition Improvements - (Oracle DB)

I have the following table definition and want to improve indexes:
CREATE TABLE MATE (
GUID NUMBER(38,0),
SITE_KEY NUMBER(38,0),
LAST_NAME VARCHAR2(200),
FIRST_NAME VARCHAR2(200),
BOOKING_NUM VARCHAR2(200),
RELEASE_DATE DATE,
STATUS VARCHAR2(200), -- Contains 'ACTIVE', 'RELEASED', 'DELETED', 'EXCLUDED', 'INACTIVE' and NULL
CONSTRAINT SYS_C008630 CHECK ("GUID" IS NOT NULL),
CONSTRAINT SYS_C008631 CHECK ("SITE_KEY" IS NOT NULL),
CONSTRAINT SYS_C008632 PRIMARY KEY (GUID, SITE_KEY),
CONSTRAINT FK8100EDAADECFC243 FOREIGN KEY (SITE_KEY) REFERENCES SITES<KEY>()
);
CREATE UNIQUE INDEX SYS_C008632 ON MATE (GUID, SITE_KEY); -- This is the PK (1)
CREATE INDEX IDX_STATUS ON MATE (STATUS); -- (2)
CREATE INDEX IDX_SITE_KEY ON MATE (SITE_KEY); -- (3)
CREATE INDEX IDX_BOOKING_NUMBER ON MATE (BOOKING_NUM); -- (4)
CREATE INDEX IDX_FNAME ON MATE (FIRST_NAME); -- (5)
CREATE INDEX IDX_LNAME ON MATE (LAST_NAME); -- (6)
CREATE INDEX BRIAN2_IX ON MATE (SITE_KEY,BOOKING_NUM); -- (7)
CREATE INDEX IDX_SITE_STATUS ON MATE (SITE_KEY,STATUS); -- (8)
CREATE INDEX IDX_PIN_SITEKEY ON MATE (BOOKING_NUM,SITE_KEY); -- (9)
CREATE INDEX IDX_SITE_NAME_STATUS ON MATE (SITE_KEY,LAST_NAME,STATUS); -- (10)
CREATE UNIQUE INDEX IDX_GUID_SITE_BOOKING ON MATE (GUID, SITE_KEY, BOOKING_NUM); -- (11)
CREATE UNIQUE INDEX IXU_SITE_BOOKING_GUID ON MATE (SITE_KEY, BOOKING_NUM, RELEASE_DATE, GUID); -- (12)
Is it logical to:
Drop index (7) because it is already covered by (9)?
Drop index (3) because it is the leftmost column in (8) and (10)?
Drop index (4) because it is the leftmost column in (9)?
Drop index (12) because SITE_KEY, BOOKING_NUM, GUID is already in a UNIQUE index in (11)?
Any other improvements?
You can't optimize indexes only by looking at their definition. You need to know how the indexes are used before you remove them.
Your Indexes Are Not Necessarily Redundant
For items #1 and #3, there are rare cases where you want to have two indexes that only differ based on the column order. For example, with the below two queries, it helps to have an index with both columns so you can avoid reading from the table. And the two different leading columns work better for each query. Having only one index is usually good enough, but maybe these are critical queries that need to be thoroughly optimized.
SELECT A, B FROM TABLE1 WHERE A = 1;
SELECT A, B FROM TABLE1 WHERE B = 2;
For items #2 and #4, the single-column indexes may be optimized for filtering, whereas the multi-column indexes may be optimized for index fast full scans (where the index acts like a skinny version of the table). For example, with the below queries, the first one runs best with an index on only column A, because that index is smaller and will be faster to read and more likely to fit in your cache. But the second query works best if there is an index on (A,B,C). Having the single, larger index is usually good enough, but not always.
SELECT * FROM TABLE1 WHERE A = 1;
SELECT A, B, C FROM TABLE1;
Which Indexes Are Necessary?
To find out which indexes are necessary, you should use index usage tracking. Fully optimizing indexes is a long, difficult process. But if you've gathered a list of suspicious indexes, and they are not used by any SQL statements, then they're probably safe to drop.
--Check that index statistics are collected.
select * from gv$index_usage_info;
--Check which indexes are used.
select * from dba_index_usage order by last_used desc;
--Find recent SQL statements that used the index.
select * from gv$sql_plan where object_owner = 'JHELLER' and object_name = 'TEST1_IDX';
--Find historical SQL statements that used the index.
select * from dba_hist_sql_plan where object_owner = 'JHELLER' and object_name = 'TEST1_IDX';
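Once usage tracking suggests an index is unused, one cautious way to test that conclusion before dropping it is to make the index invisible first. This is a suggestion beyond what the answer above describes, sketched here for index (3) from the question; invisible indexes are available from Oracle 11g onward.

```sql
-- Hide the index from the optimizer; it is still maintained by DML.
ALTER INDEX idx_site_key INVISIBLE;

-- ... run the normal workload for a while and watch for regressions ...

-- If something slowed down, bring the index back without a rebuild:
ALTER INDEX idx_site_key VISIBLE;

-- If nothing regressed, the index is probably safe to drop:
DROP INDEX idx_site_key;
```

Because an invisible index keeps being maintained, making it visible again is cheap, whereas recreating a dropped index on a large table is not.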

Define index for sparse column

I have a table with columns 'A' and 'B'.
'A' is a column with 90% NULL and 10% different values, and most of the time I query for records with one or two of these different values.
'B' is a column with 90% value = '1' and 10% different values, and most of the time I query for records with one or two of these different values.
This table is subject to DML transactions most of the time.
Now, I don't know whether defining indexes on these columns is a good idea. If yes, which type of index?
In principle, a bitmap index would be the best choice in such a situation. However, bitmap indexes are not suitable for a multi-user environment with frequent DML: you would slow down your application significantly through locking and might even get deadlocks.
Maybe you can optimize your application by smart partitioning and the use of partial indexes (a new feature in Oracle 12c).
The two CREATE TABLE statements below should be equivalent.
CREATE TABLE YOUR_TABLE (a INTEGER, b INTEGER, ... more COLUMNS)
PARTITION BY LIST (a) SUBPARTITION BY LIST (b) (
PARTITION part_a_NULL VALUES (NULL) (
SUBPARTITION part_a_NULL_b_1 VALUES (1) INDEXING OFF,
SUBPARTITION part_a_NULL_b_other VALUES (DEFAULT) INDEXING ON
),
PARTITION part_a_others VALUES (DEFAULT) (
SUBPARTITION part_a_others_b_1 VALUES (1) INDEXING OFF,
SUBPARTITION part_a_others_b_other VALUES (DEFAULT) INDEXING ON
)
);
CREATE TABLE YOUR_TABLE (a INTEGER, b INTEGER, ... more COLUMNS)
PARTITION BY LIST (a) SUBPARTITION BY LIST (b)
SUBPARTITION TEMPLATE (
SUBPARTITION b_1 VALUES (1) INDEXING OFF,
SUBPARTITION b_other VALUES (DEFAULT) INDEXING ON
)
(
PARTITION part_a_NULL VALUES (NULL),
PARTITION part_a_others VALUES (DEFAULT)
);
CREATE INDEX IND_A ON YOUR_TABLE (A) LOCAL INDEXING PARTIAL;
CREATE INDEX IND_B ON YOUR_TABLE (B) LOCAL INDEXING PARTIAL;
This way each index covers only the subpartitions marked INDEXING ON, i.e. roughly 10% of the rows. If your WHERE condition is WHERE A IS NULL or WHERE B = 1, the Oracle optimizer would skip such indexes anyway.
Verify with this query whether INDEXING is enabled on the desired subpartitions:
SELECT table_name, partition_name, subpartition_name, indexing
FROM USER_TAB_SUBPARTITIONS
WHERE table_name = 'YOUR_TABLE';
Update
I just realized this is actually overkill, because NULL values in column A do not create any index entry anyway. So it can be simplified to
CREATE TABLE YOUR_TABLE (a INTEGER, b INTEGER, ... more COLUMNS)
PARTITION BY LIST (b) (
PARTITION part_b_1 VALUES (1) INDEXING OFF,
PARTITION part_b_other VALUES (DEFAULT) INDEXING ON
);
For example, if you have index a_b_idx on A, B (in that order):
a) select ... from ... where A = ... will use index
b) select ... from ... where B = ... will not use index
On the other side, if you have index b_a_idx on B, A:
a) select ... from ... where A = ... will not use index
b) select ... from ... where B = ... will use index
Oracle generally can't use the second column of an index efficiently if the query doesn't filter on the first column (an index skip scan is the exception), since in the regular case an index is a tree-like structure: column1->column2->column3->etc.
You need an index on column A only, or on columns A, B, if you run queries like a).
You need an index on column B only, or on columns B, A, if you run queries like b).
Oracle doesn't store entries whose key values are all NULL in an index, but it can store a NULL value for A if B contains a non-null value.
Sometimes it's more efficient to read the whole table into memory and ignore the index. The optimizer can do this if the expected result set is big and covers most of the records, since going from index entry to record costs more than simply reading the records.
Also, sometimes this happens erroneously for tables without statistics, so you either need jobs that gather statistics (analyze table ... compute statistics, or DBMS_STATS) or Oracle 11+, which can gather statistics like this without custom jobs.
Most of the time, another index is a good thing for queries but a bad thing for updates and disk usage. Each index takes disk space, and each update of a record updates every affected index. So for heavily updated tables it's not good to have many indexes, but for frequently queried tables it's better to have indexes covering all common cases.
For most flat queries (without joins/subqueries/hierarchy) only 1 index is used, so having indexes for each column is generally just a waste of disk space. You need a multicolumn index to optimize where A=... and B=...
As for index type, you probably need simple non-unique indexes.
Column A
Let's assume that you create an index named _columnA_index_. In Oracle, a single-column B-tree index does not include NULL values, which means there are no index entries in _columnA_index_ pointing to records having NULL values. Thus, the following query
Q1: select * from MyTable where A is null;
will result in a table scan instead (or the DBMS opts to use another index on another column, if any).
However, since 10% of the records have 'different values', the _columnA_index_ will of course help for queries such as
Q2: select * from MyTable where A = '123';
In the above example, if the query returns < 1% of the records, the _columnA_index_ is helpful. Depending on how selective the query is, the index can greatly improve performance. You can create an index that is suitable for the datatype of column A.
Column B
Similarly, an index on B will not help with the dominant value, because that query matches about 90% of the rows:
Q3: select * from MyTable where B = 1;
but it will help with different values
Q4: select * from MyTable where B = '456';
NULL values
So far, I have said that an index does not help with NULL values. Therefore, if you need to run Q1 most of the time, I suggest the following ideas:
Make sure that your version of the DBMS supports including NULL values in indexes. (In Oracle, a plain single-column B-tree index does not store NULL keys; a composite index whose other key is never NULL does.)
Consider creating a function-based index (again, this is Oracle), but you can at least take the idea.
Redesign the logic of your application so that it does not need to query on NULL values. I prefer this approach.

Why does SQLite not use an index for queries on my many-to-many relation table?

It's been a while since I've written code, and I never used SQLite before, but many-to-many relationships used to be so fundamental, there must be a way to make them fast...
This is an abstracted version of my database:
CREATE TABLE a (_id INTEGER PRIMARY KEY, a1 TEXT NOT NULL);
CREATE TABLE b (_id INTEGER PRIMARY KEY, fk INTEGER NOT NULL REFERENCES a(_id));
CREATE TABLE d (_id INTEGER PRIMARY KEY, d1 TEXT NOT NULL);
CREATE TABLE c (_id INTEGER PRIMARY KEY, fk INTEGER NOT NULL REFERENCES d(_id));
CREATE TABLE b2c (fk_b NOT NULL REFERENCES b(_id), fk_c NOT NULL REFERENCES c(_id), CONSTRAINT PK_b2c_desc PRIMARY KEY (fk_b, fk_c DESC), CONSTRAINT PK_b2c_asc UNIQUE (fk_b, fk_c ASC));
CREATE INDEX a_a1 on a(a1);
CREATE INDEX a_id_and_a1 on a(_id, a1);
CREATE INDEX b_fk on b(fk);
CREATE INDEX b_id_and_fk on b(_id, fk);
CREATE INDEX c_id_and_fk on c(_id, fk);
CREATE INDEX c_fk on c(fk);
CREATE INDEX d_id_and_d1 on d(_id, d1);
CREATE INDEX d_d1 on d(d1);
I have put in every index I could think of, just to make sure (more than is reasonable, but that's not a problem, since the data is read-only). And yet on this query
SELECT count(*)
FROM a, b, b2c, c, d
WHERE a.a1 = 'A'
AND a._id = b.fk
AND b._id = b2c.fk_b
AND c._id = b2c.fk_c
AND d._id = c.fk
AND d.d1 = 'D';
the relation table b2c does not use any indexes:
0|0|2|SCAN TABLE b2c
0|1|1|SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?)
0|2|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
0|3|3|SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?)
0|4|4|SEARCH TABLE d USING INTEGER PRIMARY KEY (rowid=?)
The query is about two orders of magnitude too slow to be usable. Is there any way to make SQLite use an index on b2c?
Thanks!
In a nested loop join, the outermost table does not use an index for the join (because the database just goes through all rows anyway).
To be able to use an index for a join, the index and the other column must have the same affinity, which usually means that both columns must have the same type.
Change the types of the b2c columns to INTEGER.
If the lookups on a1 or d1 are very selective, using a or d as the outermost table might make sense, and would then allow an index to be used for the filter.
Try running ANALYZE.
If that does not help, you can force the join order with CROSS JOIN or INDEXED BY.
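The suggestions above could be combined roughly as follows. This is a sketch rather than a tested fix for the asker's database: the index name `b2c_fk_c` is made up, the extra UNIQUE constraint from the original schema is left out, and changing the column types means b2c has to be recreated and reloaded.

```sql
-- Declare the link columns as INTEGER so they have the same affinity
-- as the INTEGER PRIMARY KEY columns they are joined against.
CREATE TABLE b2c (
  fk_b INTEGER NOT NULL REFERENCES b(_id),
  fk_c INTEGER NOT NULL REFERENCES c(_id),
  PRIMARY KEY (fk_b, fk_c)
);

-- Hypothetical reverse index for plans that reach b2c from the c side.
CREATE INDEX b2c_fk_c ON b2c(fk_c, fk_b);

-- Gather statistics so the planner can estimate selectivity.
ANALYZE;

-- CROSS JOIN makes SQLite keep the tables in the written order,
-- so a is the outermost loop and b2c is searched by fk_b.
EXPLAIN QUERY PLAN
SELECT count(*)
FROM a
CROSS JOIN b
CROSS JOIN b2c
CROSS JOIN c
CROSS JOIN d
WHERE a.a1 = 'A'
  AND b.fk = a._id
  AND b2c.fk_b = b._id
  AND c._id = b2c.fk_c
  AND d._id = c.fk
  AND d.d1 = 'D';
```

Whether starting from a or from d is faster depends on how selective the a1 and d1 filters are, so it is worth trying the join order from both ends.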

Oracle: use index for searching null values

I've done some searching, but I would prefer something like a hint or similar:
http://www.dba-oracle.com/oracle_tips_null_idx.htm
http://www.oracloid.com/2006/05/using-index-for-is-null/
What about a function-based index using NVL2, like:
CREATE TABLE foo (bar INTEGER);
INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (NULL);
CREATE INDEX baz ON foo (NVL2(bar,0,1));
and then;
DELETE plan_table;
EXPLAIN PLAN FOR SELECT * FROM foo WHERE NVL2(bar,0,1) = 1;
SELECT operation, object_name FROM plan_table;
should give you
OPERATION OBJECT_NAME
---------------- -----------
SELECT STATEMENT
TABLE ACCESS FOO
INDEX BAZ << yep
If you're asking, "How can I create an index that would allow it to be used when searching for NULL values on a particular field", my suggestion is to create an index on the field you're interested in PLUS the primary key field(s). Thus, if you've got a table called A_TABLE, with field VAL that you want to search for NULLs, and a primary key named PK, I'd create an index on (VAL, PK).
Share and enjoy.
I'm going to "answer" the non-question above.
The articles you link to are kinda right: Oracle's b-tree indexes will not have an entry for a row when all of its indexed columns are null. Take this example:
CREATE TABLE MYTABLE (
ID NUMBER(8) NOT NULL,
DAT VARCHAR2(100)
);
CREATE INDEX MYTABLE_IDX_1 ON MYTABLE (DAT);
/* Perform inserts into MYTABLE where some DAT are null */
SELECT COUNT(*) FROM MYTABLE WHERE DAT IS NULL;
The final SELECT will not be able to use the index, because the leaves (right-most column) will not capture the nulls. Burleson's solution is stupid, because now you have to use an NVL in all your queries and have compromised the data in the tables. Gorbachev's method includes a known NOT NULL column for the leaves of the b-tree, but this expands the index for no reason. Maybe in his case the index made sense that way for tuning other queries, but if all you want to do is find the NULLs then the easiest solution is to make the leaf a constant.
CREATE INDEX MYTABLE_IDX_1 ON MYTABLE (DAT, 1);
Now, the leaves are all the constant (1), and by default the nulls will all be together (either at the top or bottom of the index, but it doesn't really matter as Oracle can use the index forwards or backwards). There is a slight storage penalty for that constant, but a single number is smaller than most other data fields in a typical table. Now the database can use the index when querying for nulls...if the optimizer finds that the best way to get the data.
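To check whether the optimizer actually picks the index for the IS NULL predicate, you can gather statistics and look at the plan. A minimal sketch, assuming the MYTABLE and MYTABLE_IDX_1 definitions above exist in your own schema and the table holds a realistic amount of data:

```sql
-- Refresh optimizer statistics for the table and its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MYTABLE');
END;
/

-- Look for MYTABLE_IDX_1 in the plan rather than a full table scan.
EXPLAIN PLAN FOR
  SELECT COUNT(*) FROM mytable WHERE dat IS NULL;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If most rows have a NULL DAT, a full scan may still be the cheaper plan, which is consistent with the caveat above about the optimizer choosing the best path.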

Oracle: Insertion on an indexed table, avoiding duplicates. Looking for tips and advice

I'm looking for the best solution (performance-wise) to achieve this.
I have to insert records into a table, avoiding duplicates.
For example, take table A
Insert into A (
Select DISTINCT [FIELDS] from B,C,D..
WHERE (JOIN CONDITIONS ON B,C,D..)
AND
NOT EXISTS
(
SELECT * FROM A ATMP WHERE
ATMP.SOMEKEY = A.SOMEKEY
)
);
I have an index on A.SOMEKEY, just to optimize the NOT EXISTS query, but I realize that inserting into an indexed table will be a performance hit.
So I was thinking of duplicating table A in a global temporary table, where I would keep the index. Then I would remove the index from table A and execute the query, modified as follows:
Insert into A (
Select DISTINCT [FIELDS] from B,C,D..
WHERE (JOIN CONDITIONS ON B,C,D..)
AND
NOT EXISTS
(
SELECT * FROM GLOBAL_TEMPORARY_TABLE_A ATMP WHERE
ATMP.SOMEKEY = A.SOMEKEY
)
);
This would solve the "inserting into an indexed table" problem, but I would have to update the global temporary copy of A with each insertion I make.
I'm kind of lost here,
Is there a better way to achieve this?
Thanks in advance,
If the column A.SOMEKEY is declared NOT NULL and you insert a large amount of data, a NOT IN clause might be more efficient than your NOT EXISTS, since it will be able to use a HASH ANTI-JOIN.
INSERT INTO A
(SELECT DISTINCT FIELDS
FROM B, C, D ..
WHERE (JOIN CONDITIONS ON B, C, D..)
AND [B].SOMEKEY NOT IN (SELECT SOMEKEY FROM A)
AND [B].SOMEKEY IS NOT NULL);
HASH ANTI-JOINS are brutally efficient with large data sets.
I don't think the temporary table is a good idea in that case because you will be in one of these two cases:
the temporary table is indexed on SOMEKEY, your point about inserting into an indexed table being therefore moot
the temporary table is unindexed and your anti-join will be inefficient
Which method is the most efficient will probably depend on the volume of data.
How about keeping the index on table A?
create table b (same structure as table a) with NOLOGGING
Insert /*+APPEND */ into b (
Select DISTINCT [FIELDS] from B,C,D..
WHERE (JOIN CONDITIONS ON B,C,D..)
AND
NOT EXISTS
(
SELECT * FROM A ATMP WHERE
ATMP.SOMEKEY = A.SOMEKEY
)
);
Then drop the index on A and INSERT INTO A SELECT * FROM B.
You could make B a global temporary table, but make sure that the data is persistent for the session, as dropping the index will implicitly commit.
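The staging approach could look roughly like this. All names here are assumptions for illustration: `a_stage` is the staging table, `source_rows` stands for the B, C, D join (assumed to expose the same columns as A), and `a_somekey_idx` is the index on A.SOMEKEY. Recreating the index at the end is also an assumption; the answer above only mentions dropping it.

```sql
-- Session-scoped staging table with the same columns as A.
-- PRESERVE ROWS keeps the data across the implicit commit of the DDL below.
CREATE GLOBAL TEMPORARY TABLE a_stage
  ON COMMIT PRESERVE ROWS
  AS SELECT * FROM a WHERE 1 = 0;

-- Find the new rows while the index on A.SOMEKEY is still in place.
INSERT /*+ APPEND */ INTO a_stage
SELECT DISTINCT s.*
FROM   source_rows s
WHERE  NOT EXISTS (SELECT 1 FROM a WHERE a.somekey = s.somekey);
COMMIT;

-- Load A without index maintenance, then rebuild the index once.
DROP INDEX a_somekey_idx;
INSERT INTO a SELECT * FROM a_stage;
COMMIT;
CREATE INDEX a_somekey_idx ON a (somekey);
```

Whether dropping and rebuilding the index pays off depends on how many rows are inserted relative to the size of A; for small batches, leaving the index in place is likely cheaper.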

Resources