SQL Timeout and indices - performance

Changing and finding stuff in a database containing a few dozen tables with around half a million rows in the big ones I'm running into timeouts quite often.
Some of these timeouts I don't understand. For example I got this table:
CREATE TABLE dbo.[VPI_APO]
(
[Key] bigint IDENTITY(1,1) NOT NULL CONSTRAINT [PK_VPI_APO] PRIMARY KEY,
[PZN] nvarchar(7) NOT NULL,
[Key_INB] nvarchar(5) NOT NULL,
) ON [PRIMARY]
GO
ALTER TABLE dbo.[VPI_APO] ADD CONSTRAINT [IX_VPI_APOKey_INB] UNIQUE NONCLUSTERED
(
[PZN],
[Key_INB]
) ON [PRIMARY]
GO
I often get timeouts when I search an item in this table like this (during inserting high volumes of items):
SELECT [Key] FROM dbo.[VPI_APO] WHERE ([PZN] = #Search1) AND ([Key_INB] = #Search2)
These timeouts when searching on the unique constraints happen quite often. I expected unique constraints to have the same benefits as indices, was I mistaken? Do I need an index on these fields, too?
Or will I have to search differently to benefit of the constraint?
I'm using SQL Server 2008 R2.

A unique constraint creates an index. Based on the query and the constraint defined you should be in a covering situation, meaning the index provides everything the query needs without a need to back to the cluster or the heap to retrieve data. I suspect that your index is not sufficiently selective or that your statistics are out of date. First try updating the statistics with sp_updatestats. If that doesn't change the behavior, try using UPDATE STATISTICS VPI_APO WITH FULL SCAN. If neither of those work, you need to examine the selectivity of the index using DBCC SHOW_STATISTICS.

My first thought is parameter sniffing.
If you're on SQL Server 2008, try "OPTIMIZE FOR UNKNOWN" rather than parameter masking
2nd thought is change the unique constrant to an index and INCLUDE the Key column explicitly. Internally they are the same, but as an index you have some more flexibility (eg filter, include etc)

Related

Postgres (AWS Aurora) is not enforcing unique index/constraint

We are using Postgres for our production database, it's technically an Amazon AWS Aurora database using the 10.11 engine version. It doesn't seem to be under any unreasonable load (100-150 concurrent connections, CPU always under 10%, about 50% of the memory used, spikes to 300 write IOPS / 1500 read IOPS per second).
We like to ensure really good data consistency, so we make extensive use of foreign keys, triggers to validate data as it's being inserted/updated and also lots of unique constraints.
Most of the writes originate from simple REST API requests, which result in very standard insert and update queries. However, in some cases we also use triggers and functions to handle more complicated logic. For example, an update to one table will result in some fairly complicated cascading updates to other tables.
All queries are always wrapped in transactions, and for the most part we do not make use of explicit locking.
So what's wrong?
We have many (dozens of rows, across dozens of tables) instances where data exists in the database which does not conform to our unique constraints.
Sometimes the created_at and updated_at timestamps for the offending rows are identical, other times they are very similar (within half a second). This leads me to believe that this is being caused by a race condition.
We're not certain, but are fairly confident that the thing in common with these records is that the writes either triggered a function (the record was written from a simple insert or update, and caused several other tables to be updated) or that the write came from a function (a different record was written from a simple insert or update, which triggered a function that wrote the offending data).
From what I have been able to research, unique constraints/indexes are incredibly reliable and "just work". Is this true? If so, then why might this be happening?
Here is an example of some offending data, I've had to black out some of it, but I promise you the values in the user_id field are identical. As you will see below, there is a unique index across user_id, position, and undeleted. So the presence of this data should be impossible.
Here is an export of table structure:
-- Table Definition ----------------------------------------------
CREATE TABLE guides.preferences (
id uuid DEFAULT gen_random_uuid() PRIMARY KEY,
user_id uuid NOT NULL REFERENCES users.users(id),
guide_id uuid NOT NULL REFERENCES users.users(id),
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL,
undeleted boolean DEFAULT true,
deleted_at timestamp without time zone,
position integer NOT NULL CHECK ("position" >= 0),
completed_meetings_count integer NOT NULL DEFAULT 0,
CONSTRAINT must_concurrently_set_deleted_at_and_undeleted CHECK (undeleted IS TRUE AND deleted_at IS NULL OR undeleted IS NULL AND deleted_at IS NOT NULL),
CONSTRAINT preferences_guide_id_user_id_undeleted_unique UNIQUE (guide_id, user_id, undeleted),
CONSTRAINT preferences_user_id_position_undeleted_unique UNIQUE (user_id, position, undeleted) DEFERRABLE INITIALLY DEFERRED
);
COMMENT ON COLUMN guides.preferences.undeleted IS 'Set simultaneously with deleted_at to flag this as deleted or undeleted';
COMMENT ON COLUMN guides.preferences.deleted_at IS 'Set simultaneously with deleted_at to flag this as deleted or undeleted';
-- Indices -------------------------------------------------------
CREATE UNIQUE INDEX preferences_pkey ON guides.preferences(id uuid_ops);
CREATE UNIQUE INDEX preferences_user_id_position_undeleted_unique ON guides.preferences(user_id uuid_ops,position int4_ops,undeleted bool_ops);
CREATE INDEX index_preferences_on_user_id_and_guide_id ON guides.preferences(user_id uuid_ops,guide_id uuid_ops);
CREATE UNIQUE INDEX preferences_guide_id_user_id_undeleted_unique ON guides.preferences(guide_id uuid_ops,user_id uuid_ops,undeleted bool_ops);
We're really stumped by this, and hope that someone might be able to help us. Thank you!
I found it the reason! We have been building a lot of new functionality over the last few months, and have been running lots of migrations to change schema and update data. Because of all the triggers and functions in our database, it often makes sense to temporarily disable triggers. We do this with “set session_replication_role = ‘replica’;”.
Turns out that this also disables all deferrable constraints, because deferrable constraints and foreign keys are trigger based. As you can see from the schema in my question, the unique constraint in question is set as deferrable.
Mystery solved!

Unique constraint without index

let's say I have a large table.
This table not need to be queried, I just want to save the data inside for a while.
I want to prevent duplicates rows in the table, so I want to add an unique
constraint (or PK) on the table.
But the auto-created unique index is realy unnecessary.
I don't need it, and it's just wasting space in disk and require a maintenance
(regardless of the long time to create it).
Is there a way to create an unique constraint without index (any index - unique or nonunique)?
Thank you.
No, you can't have a UNIQUE constraint in Oracle without a corresponding index. The index is created automatically when the constraint is added, and any attempt to drop the index results in the error
ORA-02429: cannot drop index used for enforcement of unique/primary key
Best of luck.
EDIT
But you say "Let's say I have a large table". So how many rows are we talking about here? Look, 1TB SSD's are under $100. Quad-core laptops are under $400. If you're trying to minimize storage use or CPU burn by writing a bunch of code with minimal applicability to "save money" or "save time" my suggestion is that you're wasting both time and money. I repeat - ONE TERABYTE of storage costs the same as ONE HOUR of programmer time. A BRAND SPANKING NEW COMPUTER costs the same as FOUR LOUSY HOURS of programmer time. You are far, far better off doing whatever you can to minimize CODING TIME, rather than the traditional optimization targets of CPU time or disk space. Thus, I submit that the UNIQUE index is the low cost solution.
But the auto-created unique index is really unnecessary.
In fact, UNIQUEness in an Oracle Database is enforced/guaranteed via an INDEX. That's why your primary key constraints come with a UNIQUE INDEX.
Per the Docs
UNIQUE Key Constraints and Indexes
Oracle enforces unique integrity constraints with indexes.
Maybe Index-Organized Tables is what you need ?.
But strictly the index organized table is the table stored in the structure of the index - one can say that there is the index alone without the table, while yor requirement is to have the table without the index, so this is the opposite :)
CREATE TABLE some_name
(
col1 NUMBER(10) NOT NULL,
col2 NUMBER(10) NOT NULL,
col3 VARCHAR2(50) NOT NULL,
col4 VARCHAR2(50) NOT NULL,
CONSTRAINT pk_locations PRIMARY KEY (col1, col2)
)
ORGANIZATION INDEX

How to generate PHYSICALLY sorted GUID as primary key in Oracle?

I'm trying to generate a sorted GUID as a primary key in Oracle. In SQL Server I could use one of the following to sort the rows physically
By clustered primary key as a unique identifier.
By NEWSEQUENTIALID.
I have searched for an Oracle equivalent but failed to find a solution. I know about Is there a way to create an auto-incrementing Guid Primary Key in an Oracle database?, but there's no indication whether SYS_GUID() is sorted.
How could I create a sequential primary key in Oracle?
If you want to create a GUID then SYS_GUID() is what you should be using, you can create this in a table as per the linked question. It's unclear from the documentation whether SYS_GUID() is incrementing. It might be but that's not really a statement that imparts trust.
The next part of your question (and some comments) keeps asking about clustered primary keys. This concept does not exist in the same way in Oracle as it does in SQL Server and Sybase. Oracle does have indexed organized tables (IOT) ...
... a table stored in a variation of a B-tree index structure... rows
are stored in an index defined on the primary key for the table. Each
index entry in the B-tree also stores the non-key column values. Thus,
the index is the data, and the data is the index.
There are plenty of uses for IOTs but it's worth bearing in mind you're altering the physical structure of the database on the disk for "performance reasons". You're doing the ultimate of all premature optimizations using something that has both negative and positive aspects. Read the documentation and be sure that this is what you want to do.
I would generally use an IOT only when you don't care about DML performance but when you do a lot of range scans, or you need to order by the primary key. You create an IOT in the same way as you would an ordinary table, but because everything you want is now part of the table everything goes in your table definition:
create table test_table (
id raw(32) default sys_guid()
, a_col varchar2(50)
, constraint pk_my_iot primary key (id)
) organization index;
It's worth noting that even with an IOT you must use an explicit ORDER BY in order to guarantee returned order. However, because of the way this is stored Oracle can table a few short cuts:
select *
from ( select *
from test_table
order by id )
where rownum < 2
SQL Fiddle.
As with everything, test, don't assume that this is the structure you want.
Oracle has as SYS_GUID() function, which generates a 16-byte RAW datatype. But, I'm not sure what you mean by "sorted GUID". Can you elaborate?
Do you mean you need each generated GUID to sort "after" the previously generated GUID? I looked at the SYS_GUID() function, and it seems to generate GUIDs in sorted order, but looking at the documentation, I don't see anything that says that is guaranteed.
If I understand your question correctly, I'm not sure it's possible.
You may be able to use SYS_GUID() and prepend a sequence, to get your desired sort order?
Can you explain more about your use case?
Adding the following in response to comment:
Ok, now I think I understand. What I think you want, is something called an IOT, or Index Organized Table, in Oracle. It's a table that has an index strucure, and all data is clustered, or grouped by the primary key. More information is available here:
http://docs.oracle.com/cd/E16655_01/server.121/e17633/indexiot.htm#CNCPT721
I think that should do what you want.

MiniProfiler SqlServerStorage becomes quite slow

We use mini profiler in two ways:
On developer machines with the pop-up
In our staging/prod environments with SqlServerStorage storing to MS SQL
After a few weeks we find that writing to the profiling DB takes a long time (seconds), and is causing real issues on the site. Truncating all profiler tables resolves the issue.
Looking through the SqlServerStorage code, it appears the inserts also do a check to make sure a row with that id doesnt already exist. Is this to ensure DB agnostic code? This seems it would introduce a massive penalty as the number of rows increases.
How would I go about removing the performance penalty from the performance profiler? Is anyone else experiencing this slow down? Or is it something we are doing wrong?
Cheers for any help or advice.
Hmm, it looks like I made a huge mistake in how that MiniProfilers table was created when I forgot about primary key being clustered by default... and the clustered index is a GUID column, a very big no-no.
Because data is physically stored on disk in the same order as the clustered index (indeed, one could say the table is the clustered index), SQL Server has to keep every newly inserted row in that physical order. This becomes a nightmare to keep sorted when we're using essentially a random number.
The fix is to add an auto-increasing int and switch the primary key to that, just like all the other tables (why I overlooked this, I don't remember... we don't use this storage provider here on Stack Overflow or this issue would have been found long ago).
I'll update the table creation scripts and provide you with something to migrate your current table in a bit.
Edit
After looking at this again, the main MiniProfilers table could just be a heap, meaning no clustered index. All access to the rows is by that guid ID column, so no physical ordering would help.
If you don't want to recreate your MiniProfiler sql tables, you can use this script to make the primary key nonclustered:
-- first remove the clustered index from the primary key
declare #clusteredIndex varchar(50);
select #clusteredIndex = name
from sys.indexes
where type_desc = 'CLUSTERED'
and object_name(object_id) = 'MiniProfilers';
exec ('alter table MiniProfilers drop constraint ' + #clusteredIndex);
-- and then make it non-clustered
alter table MiniProfilers add constraint
PK_MiniProfilers primary key nonclustered (Id);
Another Edit
Alrighty, I've updated the creation scripts and added indexes for most querying - see the code here in GitHub.
I would highly recommended dropping all your existing tables and rerunning the updated script.

oracle- index organized table

what is use-case of IOT (Index Organized Table) ?
Let say I have table like
id
Name
surname
i know the IOT but bit confuse about the use case of IOT
Your three columns don't make a good use case.
IOT are most useful when you often access many consecutive rows from a table. Then you define a primary key such that the required order is represented.
A good example could be time series data such as historical stock prices. In order to draw a chart of the stock price of a share, many rows are read with consecutive dates.
So the primary key would be stock ticker (or security ID) and the date. The additional columns could be the last price and the volume.
A regular table - even with an index on ticker and date - would be much slower because the actual rows would be distributed over the whole disk. This is because you cannot influence the order of the rows and because data is inserted day by day (and not ticker by ticker).
In an index-organized table, the data for the same ticker ends up on a few disk pages, and the required disk pages can be easily found.
Setup of the table:
CREATE TABLE MARKET_DATA
(
TICKER VARCHAR2(20 BYTE) NOT NULL ENABLE,
P_DATE DATE NOT NULL ENABLE,
LAST_PRICE NUMBER,
VOLUME NUMBER,
CONSTRAINT MARKET_DATA_PK PRIMARY KEY (TICKER, P_DATE) ENABLE
)
ORGANIZATION INDEX;
Typical query:
SELECT TICKER, P_DATE, LAST_PRICE, VOLUME
FROM MARKET_DATA
WHERE TICKER = 'MSFT'
AND P_DATE BETWEEN SYSDATE - 1825 AND SYSDATE
ORDER BY P_DATE;
Think of index organized tables as indexes. We all know the point of an index: to improve access speeds to particular rows of data. This is a performance optimisation of trick of building compound indexes on sub-sets of columns which can be used to satisfy commonly-run queries. If an index can completely satisy the columns in a query's projection the optimizer knows it doesn't have to read from the table at all.
IOTs are just this approach taken to its logical confusion: buidl the index and throw away the underlying table.
There are two criteria for deciding whether to implement a table as an IOT:
It should consists of a primary key (one or more columns) and at most one other column. (okay, perhaps two other columns at a stretch, but it's an warning flag).
The only access route for the table is the primary key (or its leading columns).
That second point is the one which catches most people out, and is the main reason why the use cases for IOT are pretty rare. Oracle don't recommend building other indexes on an IOT, so that means any access which doesn't drive from the primary key will be a Full Table Scan. That might not matter if the table is small and we don't need to access it through some other path very often, but it's a killer for most application tables.
It is also likely that a candidate table will have a relatively small number of rows, and is likely to be fairly static. But this is not a hard'n'fast rule; certainly a huge, volatile table which matched the two criteria listed above could still be considered for implementations as an IOT.
So what makes a good candidate dor index organization? Reference data. Most code lookup tables are like something this:
code number not null primary key
description not null varchar2(30)
Almost always we're only interested in getting the description for a given code. So building it as an IOT will save space and reduce the access time to get the description.

Resources