List partitioning in oracle

List partitioning in oracle - oracle

I have a table with following schema:
MyTable {
User_ID,
Task_ID,
Task_Description
}
where Task_ID is the primary key.
I wish to partition it on User_ID. Also, new users may be added and I want new corresponding partitions to get created automatically.
I went through this (see page-8) and found that Oracle 11g provides Interval partitioning which does similar thing but with intervals.
Can I do the same with User_ID?

You can't automatically generate a unique partition for each distinct varchar(128) value.
You could hash partition the table. That would not guarantee that every partition had a single, unique user_id value. It would ensure that all the rows with the same user_id were in a single partition and would eliminate the need to do manual partition maintenance.
You could list partition the table. That would require, though, that you explicitly add a new partition when a new user_id value is added.
If the user_id values were strictly predictable, you could probably do something with an interval partitioning scheme on a virtual column. But that seems highly unlikely to be practical.
What is the business problem that you are trying to solve? Why is it necessary to have a single user_id value in each partition? Why are you partitioning the table in the first place?

Related

Fast data migration on the same database

I'm trying to find a way to perform a migration from two tables on the same database. This migration should be as fast as possible in order to minimize the downtime.
To put it on an example lets say I have a person table like so:
person_table -> (id, name, address)
So a person as an Id, a name and an address. My system will contain millions of person registries and it was decided that the person table should be partitioned. To do so, I've created a new table:
partitioned_person_table->(id,name,address,partition_time)
Now this table will contain an extra column called partition_time. This is the partition key for this table since this is a range partition (one partition every hour).
Finally, I need to find a way to move all the information from the person_table to the partitioned_person_table with the best performance.
The first thing I could try is to simply create a statement like:
INSERT INTO partitioned_person_table (id, name, address, partition_time)
SELECT id, name, address, CURRENT_TIMESTAMP FROM person_table;
The problem is that when it comes to millions of registries this might become very slow (also the temporary tablespace might not be able to handle all this information)
My second approach was to use the EXCHANGE PARTITION method. Unfortunetly, I cannot do this because the tables contain diffrent column numbers.
Is there any other way that I can perfom this with the best performance (less downtime) ?
Thank you.

If you can live with the state, that all the current records would be located in one partition (and your INSERT approach suggest that), you may only
1) add a new column partition_time either as NULL or possible with metadata default only - required 12c
2) switch the table to a partitioned table either with online redefinition (if you have no maintainace window, where the table is offline) or with exchange partition otherwise.

Oracle 11g Reference Partitioning and Indexes

I have a parent table containing the following columns:
- PARENT_ID: UUID
- EVENT_DATE: TIMESTAMP
- DATA_COLUMN1: VARCHAR2(255)
- DATA_COLUMN2: VARCHAR2(255)
The table is range partitioned by EVENT_DATE. Data is only retained for a month and the last partition is dropped on a daily basis.
Following my understanding, using a global index for PK would result in sub-standard performance when dropping a partition. This means that the PK of this table has to be based on both PARENT_ID + EVENT_DATE in order to create a local index.
I have a second table that is the child of the first (via one-to-many relationship). It has the following columns:
- CHILD_ID: UUID
- PARENT_ID: UUID - FK into parent table
- DATA_COLUMN3: VARCHAR2(255)
- DATA_COLUMN4: VARCHAR2(255)
To partition the child table, I decided to use reference partitioning. One of its big advantages: it removes the need to duplicate the partition key in the child table. However, based on my reasoning, the only way to achieve this is through global indexes. Here is my train of thought:
For the unique index of the parent table to be local, the PK must include the partition key, e.g. EVENT_DATE.
A foreign key constrain cannot reference only part of the PK. Thus the child table must include both PARENT_ID and EVENT_DATE columns.
What's more, I also read that "When using reference partitioning, most child table indexes
should be defined as global, unless there is a compelling reason
for a given index to be defined as local." (http://www.nocoug.org/download/2010-05/Zitelli-Reference_Partitioning_NoCOUG.pdf).
Am I missing something or is there no way to use reference partitioning without having global indexes or duplicating data?
An explanation on how reference partitioning works with local/global indexes will be much appreciated!

You understand correctly. if you want to create a reference partition you need to define a valid FK. in your case - both parent_id and event_id needs to be present in the child table.
Ref partitions are for cases when you want to partition a table according to a column not in the PK those not in the child table. this is not your case - you can apply range partition on both tables and gain max pruning.
The event_date in the child table is not redundant - it's required by the model - you need data_columns 3/4 in the child table for each instance of parent_id + event_date.
Regarding the local indexes on a child table in a ref partitioned table - my logic say exactly the opposite. if i have a ref partition i'm aiming at max pruning which means that i want as less partitions as possible to be accessed in each query. in this case i would want local indexes and not global.
You said "using a global index for PK would result in sub-standard performance when dropping a partition". when you drop a partition from the table all global indexes will be invalidated and you will have to rebuild them. this is the only performance impact regarding DDL changes. PK on a partitioned table must be global so you don't have a choice here any way.

Though above answer has been answered but on you thought: 'A foreign key constrain cannot reference only part of the PK. Thus the child table must include both PARENT_ID and EVENT_DATE columns.'
I believe FK can refer to part of PK if your attribute has been defined with Unique constraint & I see that possible in your example.
Example:
Table A ( orderid, orderdate, custid)
PF -> OrderID, OrderDate
Table B(OrderID, ItemID)
In table B, orderID can be FK if OrderID is defined as Unique in Table A.
I hope this helps.

Unique Indexes with Oracle partitioned tables

I have a table Customer_Chronics in Oracle 11g.
The table has three key columns as shown below :
branch_code
customer_id
period
I have partitioned by table by list of branch_code, and now I'm having dilemma. Which is better:
Create unique index indexNumberOne on Customer_Chronics (PERIOD, CUSTOMER_ID);
Create unique index indexNumberTwo on Customer_Chronics (branch_code, PERIOD, CUSTOMER_ID);
The actual data must be unique by period, customer_id. If I put a unique index only on these two columns Oracle will check all partitions on the table when inserting new records?

The only way to enforce uniqueness is with a unique constraint on the columns of interest. So that's your first option. The database will check all values across all partitions it this case. But as it's a unique index that shouldn't take too long no matter how big the table gets (if that's your concern).

Yes, If you put unique index on that two columns only, Oracle will create a global index and will check all partitions. This is one of challenges I face sometime because we prefer local indexs for big tables (small tables should be OK).

oracle- index organized table

what is use-case of IOT (Index Organized Table) ?
Let say I have table like
id
Name
surname
i know the IOT but bit confuse about the use case of IOT

Your three columns don't make a good use case.
IOT are most useful when you often access many consecutive rows from a table. Then you define a primary key such that the required order is represented.
A good example could be time series data such as historical stock prices. In order to draw a chart of the stock price of a share, many rows are read with consecutive dates.
So the primary key would be stock ticker (or security ID) and the date. The additional columns could be the last price and the volume.
A regular table - even with an index on ticker and date - would be much slower because the actual rows would be distributed over the whole disk. This is because you cannot influence the order of the rows and because data is inserted day by day (and not ticker by ticker).
In an index-organized table, the data for the same ticker ends up on a few disk pages, and the required disk pages can be easily found.
Setup of the table:
CREATE TABLE MARKET_DATA
(
TICKER VARCHAR2(20 BYTE) NOT NULL ENABLE,
P_DATE DATE NOT NULL ENABLE,
LAST_PRICE NUMBER,
VOLUME NUMBER,
CONSTRAINT MARKET_DATA_PK PRIMARY KEY (TICKER, P_DATE) ENABLE
)
ORGANIZATION INDEX;
Typical query:
SELECT TICKER, P_DATE, LAST_PRICE, VOLUME
FROM MARKET_DATA
WHERE TICKER = 'MSFT'
AND P_DATE BETWEEN SYSDATE - 1825 AND SYSDATE
ORDER BY P_DATE;

Think of index organized tables as indexes. We all know the point of an index: to improve access speeds to particular rows of data. This is a performance optimisation of trick of building compound indexes on sub-sets of columns which can be used to satisfy commonly-run queries. If an index can completely satisy the columns in a query's projection the optimizer knows it doesn't have to read from the table at all.
IOTs are just this approach taken to its logical confusion: buidl the index and throw away the underlying table.
There are two criteria for deciding whether to implement a table as an IOT:
It should consists of a primary key (one or more columns) and at most one other column. (okay, perhaps two other columns at a stretch, but it's an warning flag).
The only access route for the table is the primary key (or its leading columns).
That second point is the one which catches most people out, and is the main reason why the use cases for IOT are pretty rare. Oracle don't recommend building other indexes on an IOT, so that means any access which doesn't drive from the primary key will be a Full Table Scan. That might not matter if the table is small and we don't need to access it through some other path very often, but it's a killer for most application tables.
It is also likely that a candidate table will have a relatively small number of rows, and is likely to be fairly static. But this is not a hard'n'fast rule; certainly a huge, volatile table which matched the two criteria listed above could still be considered for implementations as an IOT.
So what makes a good candidate dor index organization? Reference data. Most code lookup tables are like something this:
code number not null primary key
description not null varchar2(30)
Almost always we're only interested in getting the description for a given code. So building it as an IOT will save space and reduce the access time to get the description.

Is an Index Organized Table appropriate here?

I recently was reading about Oracle Index Organized Tables (IOTs) but am not sure I quite understand WHEN to use them. So I have a small table:
create table categories
(
id VARCHAR2(36),
group VARCHAR2(100),
category VARCHAR2(100
)
create unique index (group, category, id) COMPRESS 2;
The id column is a foreign key from another table entries and my common query is:
select e.id, e.time, e.title from entries e, categories c where e.id=c.id AND e.group=? AND c.category=? ORDER by e.time
The entries table is indexed properly.
Both of these tables have millions (16M currently) of rows and currently this query really stinks (note: I have it wrapped in a pagination query also so I only get back the first 20, but for simplicity I omitted that).
Since I am basically indexing the entire table, does it make sense to create this table as an IOT?
EDIT by popular demand:
create table entries
(
id VARCHAR2(36),
time TIMESTAMP,
group VARCHAR2(100),
title VARCHAR2(500),
....
)
create index (group, time) compress 1;
My real question I dont think depends on this though. Basically if you have a table with few columns (3 in this example) and you are planning on putting a composite index on all three rows is there any reason not to use an IOT?

IOTs are great for a number of purposes, including this case where you're gonna have an index on all (or most) of the columns anyway - but the benefit only materialises if you don't have the extra index - the idea is that the table itself is an index, so put the columns in the order that you want the index to be in. In your case, you're accessing category by id, so it makes sense for that to be the first column. So effectively you've got an index on (id, group, category). I don't know why you'd want an additional index on (group, category, id).
Your query:
SELECT e.id, e.time, e.title
FROM entries e, categories c
WHERE e.id=c.id AND e.group=? AND c.category=?
ORDER by e.time
You're joining the tables by ID, but you have no index on entries.id - so the query is probably doing a hash or sort merge join. I wouldn't mind seeing a plan for what your system is doing now to confirm.
If you're doing a pagination query (i.e. only interested in a small number of rows) you want to get the first rows back as quick as possible; for this to happen you'll probably want a nested loop on entries, e.g.:
NESTED LOOPS
ACCESS TABLE BY ROWID - ENTRIES
INDEX RANGE SCAN - (index on ENTRIES.group,time)
ACCESS TABLE BY ROWID - CATEGORIES
INDEX RANGE SCAN - (index on CATEGORIES.ID)
Since the join to CATEGORIES is on ID, you'll want an index on ID; if you make it an IOT, and make ID the leading column, that might be sufficient.
The performance of the plan I've shown above will be dependent on how many rows match the given "group" - i.e. how selective an average "group" is.

Have you looked at dba-oracle.com, asktom.com, IOUG, another asktom.com?
There are penalties to pay for IOTs - e.g., poorer insert performance
Can you prototype it and compare performance?
Also, perhaps you might want to consider a hash cluster.

IOT's are a trade off. You are getting access performance for decreased insert/update performance. We typically use them for reference data that is batch loaded daily and not updated during the day. This is not to say it's the only way to use them, just how we use them.
Few things here:
You mention pagination - have you considered the first_rows hint?
Is that the order your index is in, with group as the first field? If so I'd consider moving ID to be the first column since that index will not be used.
foreign keys should have an index on the column. Consider addind an index on the foreign key (id column).
Are you sure it's not the ORDER BY causing slowness?

What version of Oracle are you using?
I ASSUME there is a primary key on table entries for field id, correct?
Why the WHERE condition does not include "c.group = e.group" ?
Try to:
Remove the order by condition
Change the index definition from "create unique index (group,
category, id)" to "create unique index (id, group, category)"
Reorganise table categories as an IOT on (group, category, id)
Reorganise table categories as an IOT on (id, group, category)
In each of the above case use EXPLAIN PLAN to review the cost

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio