Let's say I have a large table.
This table does not need to be queried; I just want to keep the data in it for a while.
I want to prevent duplicate rows in the table, so I want to add a unique
constraint (or primary key) on the table.
But the auto-created unique index is really unnecessary.
I don't need it; it just wastes disk space and requires maintenance
(not to mention the long time it takes to create).
Is there a way to create a unique constraint without an index (any index - unique or non-unique)?
Thank you.
No, you can't have a UNIQUE constraint in Oracle without a corresponding index. The index is created automatically when the constraint is added, and any attempt to drop the index results in the error
ORA-02429: cannot drop index used for enforcement of unique/primary key
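For illustration, a minimal sketch (the table and column names here are made up) that reproduces the error:
-- The unique constraint silently creates an index named after the constraint.
CREATE TABLE big_staging_tab (col1 NUMBER, col2 NUMBER);

ALTER TABLE big_staging_tab
  ADD CONSTRAINT big_staging_uq UNIQUE (col1, col2);

-- Trying to drop that index while the constraint exists fails.
DROP INDEX big_staging_uq;
-- ORA-02429: cannot drop index used for enforcement of unique/primary key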
Best of luck.
EDIT
But you say "Let's say I have a large table". So how many rows are we talking about here? Look, 1TB SSDs are under $100. Quad-core laptops are under $400. If you're trying to minimize storage use or CPU burn by writing a bunch of code with minimal applicability to "save money" or "save time", my suggestion is that you're wasting both time and money. I repeat - ONE TERABYTE of storage costs the same as ONE HOUR of programmer time. A BRAND SPANKING NEW COMPUTER costs the same as FOUR LOUSY HOURS of programmer time. You are far, far better off doing whatever you can to minimize CODING TIME, rather than the traditional optimization targets of CPU time or disk space. Thus, I submit that the UNIQUE index is the low-cost solution.
But the auto-created unique index is really unnecessary.
In fact, UNIQUEness in an Oracle Database is enforced/guaranteed via an INDEX. That's why your primary key constraints come with a UNIQUE INDEX.
Per the Docs
UNIQUE Key Constraints and Indexes
Oracle enforces unique integrity constraints with indexes.
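You can see this for yourself in the data dictionary (the table name below is just an example):
-- Adding a primary key ...
CREATE TABLE demo_tab (
  id  NUMBER,
  val VARCHAR2(50),
  CONSTRAINT demo_tab_pk PRIMARY KEY (id)
);

-- ... automatically creates a unique index that backs the constraint.
SELECT constraint_name, constraint_type, index_name
FROM   user_constraints
WHERE  table_name = 'DEMO_TAB';

SELECT index_name, uniqueness
FROM   user_indexes
WHERE  table_name = 'DEMO_TAB';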
Maybe an index-organized table is what you need?
Strictly speaking, though, an index-organized table is a table stored in the structure of an index - one could say there is only the index, without the table - while your requirement is to have the table without the index, so this is the opposite :)
CREATE TABLE some_name
(
  col1 NUMBER(10) NOT NULL,
  col2 NUMBER(10) NOT NULL,
  col3 VARCHAR2(50) NOT NULL,
  col4 VARCHAR2(50) NOT NULL,
  CONSTRAINT pk_some_name PRIMARY KEY (col1, col2)
)
ORGANIZATION INDEX;
Related
I would like to do away with some tables in my database.
One table for example has a simple Primary-Key-ID column and a VARCHAR2 column.
The VARCHAR2 column has NO duplicate values; each value just has its own unique ID.
The PK column of this table is just referenced once as a foreign key in another table.
My thought is now to copy the values from the VARCHAR2 column into the table that references it via the foreign key.
I could then remove the foreign key reference, drop the lookup table, and gain a new column with all the (now duplicated) VARCHAR2 values. These I would like to compress in some unique/distinct way.
I have heard about indexes in the Oracle Database that compress column(s), but I am not quite sure which index I need or how to use them...
The end result (and storage savings) should be about the same as with the previous setup of a table of unique values plus the foreign key reference.
Thank you for your help in advance!
Oracle basic compression allows us to compress tables. It comes with several distinct limitations, not the least of which is that it isn't suitable for OLTP databases: only direct-path inserts are compressed, while conventional inserts, updates and deletes don't benefit. So you can't do what you want that way. If your organisation has sprung for the Advanced Compression licence then you have more options, but the compression still works on the table, not on an individual column.
I think you've confused things with index compression, which does operate on columns, as it allows us to compress the leading column(s) of a compound index. But it's worth applying only when there's a lot of repetition in those columns. If your index has a unique ID for the leading column then compression will actually increase the total amount of space taken. (Just one reason why compound indexes should be built with the least selective column first and the most selective column last.)
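For example (the index and column names here are hypothetical), prefix compression is declared when an index is created or rebuilt, and it compresses the stated number of leading columns:
-- Compress the repetitive leading column of a compound index.
CREATE INDEX order_status_idx
  ON orders (status, order_id)
  COMPRESS 1;

-- Or add prefix compression to an existing index.
ALTER INDEX order_status_idx REBUILD COMPRESS 1;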
Your table is a classic key-value lookup table. So you could consider converting it into an index-organized table. You would save yourself a bit of space by maintaining only a specialized index instead of a table plus its primary key index. Find out more in the Oracle documentation on index-organized tables.
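A rough sketch of that conversion (the table and column names are made up; substitute your own):
-- An index-organized table stores the rows inside the primary key
-- index itself, so there is no separate table segment to maintain.
CREATE TABLE lookup_iot (
  id  NUMBER       NOT NULL,
  val VARCHAR2(50) NOT NULL,
  CONSTRAINT lookup_iot_pk PRIMARY KEY (id)
)
ORGANIZATION INDEX;

INSERT INTO lookup_iot (id, val)
SELECT id, val FROM lookup_tab;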
I have a test set up to write rows to a database.
Each transaction inserts 10,000 rows, no updates.
Each step takes linearly longer than the last.
The first steps took the following amounts of time (in ms) to perform a commit:
568, 772, 942, 1247, 1717, 1906, 2268, 2797, 2922, 3816, 3945
By the time it reaches adding 10,000 rows to a table of 500,000 rows, it takes 37149 ms to commit!
I have no foreign key constraints.
I have found that using WAL improves performance (it gives the figures above), but the degradation is still linear.
PRAGMA Synchronous=OFF has no effect
PRAGMA locking_mode=EXCLUSIVE has no effect
I ran both with and without the additional indexes. That made a roughly constant time difference, so the degradation was still linear.
Some other settings I have
setAutocommit(false)
PRAGMA page_size = 4096
PRAGMA journal_size_limit = 104857600
PRAGMA count_changes = OFF
PRAGMA cache_size = 10000
The schema has Id INTEGER PRIMARY KEY ASC; its values are incremental and generated by SQLite.
Full schema as follows (I have run both with and without the indexes, but have included them here):
create table if not exists [EventLog] (
  Id INTEGER PRIMARY KEY ASC,
  DocumentId TEXT NOT NULL,
  Event TEXT NOT NULL,
  Content TEXT NOT NULL,
  TransactionId TEXT NOT NULL,
  Date INTEGER NOT NULL,
  User TEXT NOT NULL);
create index if not exists DocumentId ON EventLog (DocumentId);
create index if not exists TransactionId ON EventLog (TransactionId);
create index if not exists Date ON EventLog (Date);
This is using sqlite-jdbc-3.7.2 running in a Windows environment.
SQLite tables and indexes are internally organized as B-Trees. In tables, the Rowid is the sorting key. (Your INTEGER PRIMARY KEY is the Rowid.)
If your inserted IDs are not larger than the largest ID already in the table, then the records are not appended, but inserted somewhere in the middle of the tree. When inserting enough records in one transaction, and if the distribution of IDs is random, this means that almost every page in the database must be rewritten.
To avoid this,
insert the IDs in increasing order; or
insert the IDs as NULL so that SQLite chooses the next value; or
prevent SQLite from using your ID field as the Rowid by declaring it as INTEGER UNIQUE (or just INTEGER if you don't need the extra check/index), thus making the table ordering independent of your ID (see the sketch after this list).
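A minimal sketch of that last option (the table name is made up): only a column declared INTEGER PRIMARY KEY becomes the Rowid, so here the table keeps its own implicit Rowid and inserts simply append at the end of the B-Tree.
-- AppId is an ordinary column, not the Rowid, so the table stays
-- ordered by the hidden Rowid regardless of the AppId values.
CREATE TABLE EventLog2 (
  AppId   INTEGER UNIQUE,   -- or just INTEGER to avoid the extra index
  Content TEXT NOT NULL
);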
In the case of indexes, inserting an indexed field with a random distribution requires that the index is updated at a random position. Like with tables, when inserting enough records in one transaction, this means that almost every page in the index must be rewritten.
When you're loading large amounts of data, it is recommended to do this without any indexes and to recreate them afterwards. (Unlike some other databases, SQLite has no function to temporarily disable indexes; just drop them.)
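For the schema in the question, the bulk load could look roughly like this (a sketch of the order of operations, not a tuned script):
-- Drop the secondary indexes before the bulk load ...
DROP INDEX IF EXISTS DocumentId;
DROP INDEX IF EXISTS TransactionId;
DROP INDEX IF EXISTS Date;

-- ... insert the rows in large transactions ...

-- ... and recreate the indexes once, at the end.
CREATE INDEX DocumentId ON EventLog (DocumentId);
CREATE INDEX TransactionId ON EventLog (TransactionId);
CREATE INDEX Date ON EventLog (Date);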
FYI, although I haven't limited the structure in terms of the content of the key, in 99.999% of cases it will be a GUID. So to resolve the performance issue I just wrote an algorithm for generating sequential GUIDs, using a time-based value for the first 8 hex digits. This worked very well, even if blocks of GUIDs are generated using early time values.
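For what it's worth, the same idea can be sketched directly in SQLite SQL (this is only an illustration of the approach, not the asker's actual generator, and printf() needs SQLite 3.8.3 or later): the leading 8 hex digits encode the current Unix time, so newly generated values sort after older ones and index inserts stay near the right-hand edge of the B-Tree.
-- Sequential-ish GUID: time-based prefix, random remainder.
SELECT printf('%08x', CAST(strftime('%s', 'now') AS INTEGER))
       || '-' || lower(hex(randomblob(2)))
       || '-' || lower(hex(randomblob(2)))
       || '-' || lower(hex(randomblob(2)))
       || '-' || lower(hex(randomblob(6))) AS sequential_guid;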
What is the use case of an IOT (index-organized table)?
Let's say I have a table like:
id
Name
surname
I know what an IOT is, but I'm a bit confused about its use case.
Your three columns don't make a good use case.
IOTs are most useful when you often access many consecutive rows from a table. Then you define a primary key such that the required order is represented.
A good example could be time series data such as historical stock prices. In order to draw a chart of the stock price of a share, many rows are read with consecutive dates.
So the primary key would be stock ticker (or security ID) and the date. The additional columns could be the last price and the volume.
A regular table - even with an index on ticker and date - would be much slower because the actual rows would be distributed over the whole disk. This is because you cannot influence the order of the rows and because data is inserted day by day (and not ticker by ticker).
In an index-organized table, the data for the same ticker ends up on a few disk pages, and the required disk pages can be easily found.
Setup of the table:
CREATE TABLE MARKET_DATA
(
TICKER VARCHAR2(20 BYTE) NOT NULL ENABLE,
P_DATE DATE NOT NULL ENABLE,
LAST_PRICE NUMBER,
VOLUME NUMBER,
CONSTRAINT MARKET_DATA_PK PRIMARY KEY (TICKER, P_DATE) ENABLE
)
ORGANIZATION INDEX;
Typical query:
SELECT TICKER, P_DATE, LAST_PRICE, VOLUME
FROM MARKET_DATA
WHERE TICKER = 'MSFT'
AND P_DATE BETWEEN SYSDATE - 1825 AND SYSDATE
ORDER BY P_DATE;
Think of index-organized tables as indexes. We all know the point of an index: to improve access speed to particular rows of data. There is a performance optimisation trick of building compound indexes on subsets of columns which can be used to satisfy commonly-run queries. If an index can completely satisfy the columns in a query's projection, the optimizer knows it doesn't have to read from the table at all.
IOTs are just this approach taken to its logical conclusion: build the index and throw away the underlying table.
There are two criteria for deciding whether to implement a table as an IOT:
It should consist of a primary key (one or more columns) and at most one other column (okay, perhaps two other columns at a stretch, but that's a warning flag).
The only access route for the table is the primary key (or its leading columns).
That second point is the one which catches most people out, and is the main reason why the use cases for IOT are pretty rare. Oracle don't recommend building other indexes on an IOT, so that means any access which doesn't drive from the primary key will be a Full Table Scan. That might not matter if the table is small and we don't need to access it through some other path very often, but it's a killer for most application tables.
It is also likely that a candidate table will have a relatively small number of rows, and is likely to be fairly static. But this is not a hard'n'fast rule; certainly a huge, volatile table which matched the two criteria listed above could still be considered for implementation as an IOT.
So what makes a good candidate for index organization? Reference data. Most code lookup tables look something like this:
code         number not null primary key,
description  varchar2(30) not null
Almost always we're only interested in getting the description for a given code. So building it as an IOT will save space and reduce the access time to get the description.
I have a table which is basically a tree structure, with columns id and parent_id.
parent_id is null for root nodes.
There is also a self referential foreign key, so that every parent_id has a corresponding id.
This table is mainly read-only with mostly infrequent batch updates.
One of the most common queries from the application which accesses this table is select ... where parent_id = X. I thought this might be faster if this table was index organised on parent_id.
However, I'm not sure how to index organise this table if parent_id can be null. I'd rather not fudge things so that parent_id=0 is some special id, as I'd have to add dummy values to the table to ensure the foreign key constraints are satisfied, and it also changes the application logic.
Is there any way to index-organise a table on columns that may be null?
Solution from question asker:
I found I could get the same benefits from index organisation just by adding the queried columns to the end of the parent_id index, i.e. instead of:
create index foo_idx on foo_tab(parent_id);
I do:
create index foo_idx on foo_tab(parent_id, col1, col2, col3);
Where col1, col2, col3 etc are frequently accessed columns.
I've only done this with indexes which are used to return multiple rows which benefit from the ordering and hence disk locality provided by the index, instead of having to jump around the table. Indexes which are generally used to return single rows I've left to reference the table, as there is only one row to read anyway so locality matters much less.
Like I mentioned, this is a mainly read table, and also space is not a huge concern, so I don't think the overhead to writes caused by these indexes is a big concern.
(I realise this won't index null parent_ids, but instead I've made another index on decode(parent_id, null, 1, null) which indexes nulls and only nulls).
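That extra index is a function-based index along these lines (reusing the hypothetical table name from above); a query for root nodes can then filter on the same expression, e.g. where decode(parent_id, null, 1, null) = 1, so the optimizer can use this very small index:
-- decode() returns 1 when parent_id is null and null otherwise, and rows
-- whose index expressions are all null are not stored in the index - so
-- this index contains exactly the rows with a null parent_id.
create index foo_null_parent_idx on foo_tab(decode(parent_id, null, 1, null));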
I would try adding the index on the single column parent_id.
If all of the columns in your index are null for a given row, then that row does not appear in your index.
So for the parent_id = X you cite above, this should use the index. However, if you're doing parent_id is null, then it won't use the index, and you'll be getting the same performance as you have now. This sounds like behaviour that would suit you.
I have used this in the past to improve the performance of queries. It works particularly well when the number of entries in the index is small compared to the number of rows in the table. We had about 3% of our rows in this particular index, and it flew :-)
But, as always, you need to try it and measure the difference in performance. Your mileage may vary.
While changing and finding stuff in a database containing a few dozen tables, with around half a million rows in the big ones, I'm running into timeouts quite often.
Some of these timeouts I don't understand. For example I got this table:
CREATE TABLE dbo.[VPI_APO]
(
[Key] bigint IDENTITY(1,1) NOT NULL CONSTRAINT [PK_VPI_APO] PRIMARY KEY,
[PZN] nvarchar(7) NOT NULL,
[Key_INB] nvarchar(5) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.[VPI_APO] ADD CONSTRAINT [IX_VPI_APOKey_INB] UNIQUE NONCLUSTERED
(
[PZN],
[Key_INB]
) ON [PRIMARY]
GO
I often get timeouts when I search for an item in this table like this (while inserting high volumes of items):
SELECT [Key] FROM dbo.[VPI_APO] WHERE ([PZN] = @Search1) AND ([Key_INB] = @Search2)
These timeouts when searching on the unique constraint's columns happen quite often. I expected unique constraints to have the same benefits as indexes - was I mistaken? Do I need an index on these fields, too?
Or will I have to search differently to benefit of the constraint?
I'm using SQL Server 2008 R2.
A unique constraint creates an index. Based on the query and the constraint defined, you should be in a covering situation, meaning the index provides everything the query needs without needing to go back to the clustered index or the heap to retrieve data. I suspect that your index is not sufficiently selective or that your statistics are out of date. First try updating the statistics with sp_updatestats. If that doesn't change the behavior, try UPDATE STATISTICS VPI_APO WITH FULLSCAN. If neither of those works, you need to examine the selectivity of the index using DBCC SHOW_STATISTICS.
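In concrete terms, using the table and index names from the question, the sequence looks roughly like this:
-- Refresh statistics across the database ...
EXEC sp_updatestats;

-- ... or rebuild them for this table only, reading every row.
UPDATE STATISTICS dbo.VPI_APO WITH FULLSCAN;

-- Examine density and histogram of the unique constraint's index.
DBCC SHOW_STATISTICS ('dbo.VPI_APO', 'IX_VPI_APOKey_INB');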
My first thought is parameter sniffing.
If you're on SQL Server 2008, try "OPTIMIZE FOR UNKNOWN" rather than parameter masking
My second thought is to change the unique constraint to a unique index and INCLUDE the [Key] column explicitly. Internally they are the same, but as an index you have some more flexibility (e.g. filters, included columns, etc.).
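A sketch of both suggestions (the new index name is arbitrary; adapt and test against your own workload):
-- Replace the constraint with a unique index that covers the lookup.
ALTER TABLE dbo.[VPI_APO] DROP CONSTRAINT [IX_VPI_APOKey_INB];

CREATE UNIQUE NONCLUSTERED INDEX [IX_VPI_APO_PZN_KeyINB]
ON dbo.[VPI_APO] ([PZN], [Key_INB])
INCLUDE ([Key]);

-- And/or sidestep parameter sniffing on the query itself.
SELECT [Key]
FROM dbo.[VPI_APO]
WHERE ([PZN] = @Search1) AND ([Key_INB] = @Search2)
OPTION (OPTIMIZE FOR UNKNOWN);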