How to generate PHYSICALLY sorted GUID as primary key in Oracle? - oracle

I'm trying to generate a sorted GUID as a primary key in Oracle. In SQL Server I could use one of the following to sort the rows physically
By clustered primary key as a unique identifier.
By NEWSEQUENTIALID.
I have searched for an Oracle equivalent but failed to find a solution. I know about Is there a way to create an auto-incrementing Guid Primary Key in an Oracle database?, but there's no indication whether SYS_GUID() is sorted.
How could I create a sequential primary key in Oracle?

If you want to create a GUID then SYS_GUID() is what you should be using, you can create this in a table as per the linked question. It's unclear from the documentation whether SYS_GUID() is incrementing. It might be but that's not really a statement that imparts trust.
The next part of your question (and some comments) keeps asking about clustered primary keys. This concept does not exist in the same way in Oracle as it does in SQL Server and Sybase. Oracle does have indexed organized tables (IOT) ...
... a table stored in a variation of a B-tree index structure... rows
are stored in an index defined on the primary key for the table. Each
index entry in the B-tree also stores the non-key column values. Thus,
the index is the data, and the data is the index.
There are plenty of uses for IOTs but it's worth bearing in mind you're altering the physical structure of the database on the disk for "performance reasons". You're doing the ultimate of all premature optimizations using something that has both negative and positive aspects. Read the documentation and be sure that this is what you want to do.
I would generally use an IOT only when you don't care about DML performance but when you do a lot of range scans, or you need to order by the primary key. You create an IOT in the same way as you would an ordinary table, but because everything you want is now part of the table everything goes in your table definition:
create table test_table (
id raw(32) default sys_guid()
, a_col varchar2(50)
, constraint pk_my_iot primary key (id)
) organization index;
It's worth noting that even with an IOT you must use an explicit ORDER BY in order to guarantee returned order. However, because of the way this is stored Oracle can table a few short cuts:
select *
from ( select *
from test_table
order by id )
where rownum < 2
SQL Fiddle.
As with everything, test, don't assume that this is the structure you want.

Oracle has as SYS_GUID() function, which generates a 16-byte RAW datatype. But, I'm not sure what you mean by "sorted GUID". Can you elaborate?
Do you mean you need each generated GUID to sort "after" the previously generated GUID? I looked at the SYS_GUID() function, and it seems to generate GUIDs in sorted order, but looking at the documentation, I don't see anything that says that is guaranteed.
If I understand your question correctly, I'm not sure it's possible.
You may be able to use SYS_GUID() and prepend a sequence, to get your desired sort order?
Can you explain more about your use case?
Adding the following in response to comment:
Ok, now I think I understand. What I think you want, is something called an IOT, or Index Organized Table, in Oracle. It's a table that has an index strucure, and all data is clustered, or grouped by the primary key. More information is available here:
http://docs.oracle.com/cd/E16655_01/server.121/e17633/indexiot.htm#CNCPT721
I think that should do what you want.

Related

Do I need a primary key with an Oracle identity column?

I am using Oracle 12c and I have an IDENTITY column set as GENERATED ALWAYS.
CREATE TABLE Customers
(
id NUMBER GENERATED ALWAYS AS IDENTITY,
customerName VARCHAR2(30) NULL,
CONSTRAINT "CUSTOMER_ID_PK" PRIMARY KEY ("ID")
);
Since the ID is automatically from a sequence it will be always unique.
Do I need a PK on the ID column, and if yes, will it impact the performance?
Would an index produce the same result with a better performance on INSERT?
No, you don't need a primary key necessarily, but you should always provide the optimiser as much information about your data as possible - including a unique constraint whenever possible.
In the case of a surrogate key (like your ID), it's almost always appropriate to declare it as a Primary Key, since it's the most likely candidate for referential constraints.
You could use an ordinary index on ID, and performance of lookups will be comparable depending on data volume - but there is virtually no good reason in this case to use a non-unique index instead of a unique index - and there is no good reason in this case to avoid the constraint which will require an index anyway.
Yes, you always should have a Uniqueness constraint (with rare exceptions) on an ID column if indeed it is (and should be) unique, regardless of the method by which it is populated - whether the value is provided by your application code or via an IDENTITY column.

Fastest way to SELECT a row from a table in a database (Microsoft SQL server)

I have a huge table with one int PRIMARY KEY IDENTITY column.
I guess making the SELECT query using that primary key is the fastest way for the database to find the row in the table isn't it?
If that is true i still have a question.
Is that query as fast as a call to a dictionary by key or the database still has to read all the rows from the beginning (the Primary Key column) till it finds the row itself?
Thanks in advance ^^
Using primary key is obviously the fastest way to access a particular row.
If you want to understand how it works, you have to understand how index works.
In general it works like that :
Let's say you have a table t1(col1,col2...col10) and you have an index on col1.
Index on col1 means that you have some data structure which contains pairs (col1, rec_id)
and rec_id allows direct access to row with appropriate col1.
The data structure is ordered by col1 and therefore allows efficient searching by col1.
I think searching in dictionary works per dictionary search algorithm which should be more like binary search kind.
When you declare a column as Primary key in table, then that column is indexed, hence it should be working based on hashing principle, so searching is definitely NOT row by row as you mentioned.
Finally, yes it is the common and fast way, but you should be selective about the number of columns and rows you need in your sql query. Avoid fetching large number of rows per select call.

oracle- index organized table

what is use-case of IOT (Index Organized Table) ?
Let say I have table like
id
Name
surname
i know the IOT but bit confuse about the use case of IOT
Your three columns don't make a good use case.
IOT are most useful when you often access many consecutive rows from a table. Then you define a primary key such that the required order is represented.
A good example could be time series data such as historical stock prices. In order to draw a chart of the stock price of a share, many rows are read with consecutive dates.
So the primary key would be stock ticker (or security ID) and the date. The additional columns could be the last price and the volume.
A regular table - even with an index on ticker and date - would be much slower because the actual rows would be distributed over the whole disk. This is because you cannot influence the order of the rows and because data is inserted day by day (and not ticker by ticker).
In an index-organized table, the data for the same ticker ends up on a few disk pages, and the required disk pages can be easily found.
Setup of the table:
CREATE TABLE MARKET_DATA
(
TICKER VARCHAR2(20 BYTE) NOT NULL ENABLE,
P_DATE DATE NOT NULL ENABLE,
LAST_PRICE NUMBER,
VOLUME NUMBER,
CONSTRAINT MARKET_DATA_PK PRIMARY KEY (TICKER, P_DATE) ENABLE
)
ORGANIZATION INDEX;
Typical query:
SELECT TICKER, P_DATE, LAST_PRICE, VOLUME
FROM MARKET_DATA
WHERE TICKER = 'MSFT'
AND P_DATE BETWEEN SYSDATE - 1825 AND SYSDATE
ORDER BY P_DATE;
Think of index organized tables as indexes. We all know the point of an index: to improve access speeds to particular rows of data. This is a performance optimisation of trick of building compound indexes on sub-sets of columns which can be used to satisfy commonly-run queries. If an index can completely satisy the columns in a query's projection the optimizer knows it doesn't have to read from the table at all.
IOTs are just this approach taken to its logical confusion: buidl the index and throw away the underlying table.
There are two criteria for deciding whether to implement a table as an IOT:
It should consists of a primary key (one or more columns) and at most one other column. (okay, perhaps two other columns at a stretch, but it's an warning flag).
The only access route for the table is the primary key (or its leading columns).
That second point is the one which catches most people out, and is the main reason why the use cases for IOT are pretty rare. Oracle don't recommend building other indexes on an IOT, so that means any access which doesn't drive from the primary key will be a Full Table Scan. That might not matter if the table is small and we don't need to access it through some other path very often, but it's a killer for most application tables.
It is also likely that a candidate table will have a relatively small number of rows, and is likely to be fairly static. But this is not a hard'n'fast rule; certainly a huge, volatile table which matched the two criteria listed above could still be considered for implementations as an IOT.
So what makes a good candidate dor index organization? Reference data. Most code lookup tables are like something this:
code number not null primary key
description not null varchar2(30)
Almost always we're only interested in getting the description for a given code. So building it as an IOT will save space and reduce the access time to get the description.

Is an index clustered or unclustered in Oracle?

How can I determine if an Oracle index is clustered or unclustered?
I've done
select FIELD from TABLE where rownum <100
where FIELD is the field on which is built the index. I have ordered tuples, but the result is wrong because the index is unclustered.
By default all indexes in Oracle are unclustered. The only clustered indexes in Oracle are the Index-Organized tables (IOT) primary key indexes.
You can determine if a table is an IOT by looking at the IOT_TYPE column in the ALL_TABLES view (its primary key could be determined by querying the ALL_CONSTRAINTS and ALL_CONS_COLUMNS views).
Here are some reasons why your query might return ordered rows:
Your table is index-organized and FIELD is the leading part of its primary key.
Your table is heap-organized but the rows are by chance ordered by FIELD, this happens sometimes on an incrementing identity column.
Case 2 will return sorted rows only by chance. The order of the inserts is not guaranteed, furthermore Oracle is free to reuse old blocks if some happen to have available space in the future, disrupting the fragile ordering.
Case 1 will most of the time return ordered rows, however you shouldn't rely on it since the order of the rows returned depends upon the algorithm of the access path which may change in the future (or if you change DB parameter, especially parallelism).
In both case if you want ordered rows you should supply an ORDER BY clause:
SELECT field
FROM (SELECT field
FROM TABLE
ORDER BY field)
WHERE rownum <= 100;
There is no concept of a "clustered index" in Oracle as in SQL Server and Sybase. There is an Index-Organized Table, which is similar but not the same.
"Clustered" indices, as implemented in Sybase, MS SQL Server and possibly others, where rows are physically stored in the order of the indexed column(s) don't exist as such in Oracle. "Cluster" has a different meaning in Oracle, relating, I believe, to the way blocks and tables are organized.
Oracle does have "Index Organized Tables", which are physically equivalent, but they're used much less frequently because the query optimizer works differently.
The closest I can get to an answer to the identification question is to try something like this:
SELECT IOT_TYPE FROM user_tables
WHERE table_name = '<your table name>'
My 10g instance reports IOT or null accordingly.
Index Organized Tables have to be organized on the primary key. Where the primary key is a sequence generated value this is often useless or even counter-productive (because simultaneous inserts get into conflict for the same block).
Single table clusters can be used to group data with the same column value in the same database block(s). But they are not ordered.

SQL Timeout and indices

Changing and finding stuff in a database containing a few dozen tables with around half a million rows in the big ones I'm running into timeouts quite often.
Some of these timeouts I don't understand. For example I got this table:
CREATE TABLE dbo.[VPI_APO]
(
[Key] bigint IDENTITY(1,1) NOT NULL CONSTRAINT [PK_VPI_APO] PRIMARY KEY,
[PZN] nvarchar(7) NOT NULL,
[Key_INB] nvarchar(5) NOT NULL,
) ON [PRIMARY]
GO
ALTER TABLE dbo.[VPI_APO] ADD CONSTRAINT [IX_VPI_APOKey_INB] UNIQUE NONCLUSTERED
(
[PZN],
[Key_INB]
) ON [PRIMARY]
GO
I often get timeouts when I search an item in this table like this (during inserting high volumes of items):
SELECT [Key] FROM dbo.[VPI_APO] WHERE ([PZN] = #Search1) AND ([Key_INB] = #Search2)
These timeouts when searching on the unique constraints happen quite often. I expected unique constraints to have the same benefits as indices, was I mistaken? Do I need an index on these fields, too?
Or will I have to search differently to benefit of the constraint?
I'm using SQL Server 2008 R2.
A unique constraint creates an index. Based on the query and the constraint defined you should be in a covering situation, meaning the index provides everything the query needs without a need to back to the cluster or the heap to retrieve data. I suspect that your index is not sufficiently selective or that your statistics are out of date. First try updating the statistics with sp_updatestats. If that doesn't change the behavior, try using UPDATE STATISTICS VPI_APO WITH FULL SCAN. If neither of those work, you need to examine the selectivity of the index using DBCC SHOW_STATISTICS.
My first thought is parameter sniffing.
If you're on SQL Server 2008, try "OPTIMIZE FOR UNKNOWN" rather than parameter masking
2nd thought is change the unique constrant to an index and INCLUDE the Key column explicitly. Internally they are the same, but as an index you have some more flexibility (eg filter, include etc)

Resources