Partition by Hash - Subpartition by List - oracle

I want to create a table that is partitioned by hash on one column and subpartitioned by list on another column. The table creation should look like this:
CREATE TABLE testt
(
Id CHAR(3),
time DATE,
month AS (EXTRACT (MONTH FROM time))
)
PARTITION BY HASH (Id)
PARTITIONS 4
STORE IN (ts1, ts2, ts3, ts4)
SUBPARTITION BY LIST (month)
SUBPARTITION template
(
SUBPARTITION JANUARY VALUES (01),
SUBPARTITION FEBRUARY VALUES (02),
...
)
I need to keep the hash partitioning for legacy reasons, but I can change the subpartitioning to range or hash.
However, Oracle simply will not let me create a table partitioned by hash and subpartitioned by list, range, or hash. I searched a lot but could not find a single example, and now I am wondering whether this is supported at all. Can someone please tell me how to do it?

Your statement has invalid syntax, see http://docs.oracle.com/database/121/SQLRF/statements_7002.htm#CJABBBAI.
The specification of hash partition count and tablespaces should be after the subpartition templates.
CREATE TABLE testt
(
Id CHAR(3),
time DATE,
month AS (EXTRACT (MONTH FROM time))
)
PARTITION BY HASH (Id)
SUBPARTITION BY LIST (month)
SUBPARTITION template (
SUBPARTITION JANUARY VALUES (01),
SUBPARTITION FEBRUARY VALUES (02),
...
)
PARTITIONS 4
STORE IN (ts1, ts2, ts3, ts4)
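Writing out a list subpartition per month by hand is repetitive, so the template clause can also be generated. The following is only a sketch: it assumes the elided entries follow the same pattern as JANUARY and FEBRUARY (upper-case English month name, two-digit month number), and the helper name month_subpartition_template is made up for illustration.

```python
# Hypothetical helper: builds the SUBPARTITION TEMPLATE clause for 12 months.
# Assumption: every entry follows the JANUARY / FEBRUARY pattern shown above.
MONTH_NAMES = [
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
    "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
]


def month_subpartition_template() -> str:
    """Return a SUBPARTITION TEMPLATE clause with one list subpartition per month."""
    entries = [
        f"SUBPARTITION {name} VALUES ({number:02d})"
        for number, name in enumerate(MONTH_NAMES, start=1)
    ]
    return "SUBPARTITION TEMPLATE (\n" + ",\n".join(entries) + "\n)"


template_sql = month_subpartition_template()
print(template_sql)
```

The generated text would go between the SUBPARTITION BY LIST (month) line and the PARTITIONS 4 line of the statement.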

Related

Can I create a growing Interval partitioned table with a default/maxvalue partition?

Summary of the question: I want to create a range-partitioned table. However, records whose range value is not yet known should reside in a different (default) partition and be moved to the correct partition once the value is filled in. The default partition would never be dropped, while the other partitions would be dropped by a script after a defined retention period.
The whole story:
I have a table where the records have to be placed in a partition based on a date field. This is a growing table and after some time the data from these partitions can be purged. I used to create table with something like the snippet below.
This works fine because we knew the value of the date column based on which we partition (RDATE). However in our new project we do not know this when a record is inserted. The value would eventually be filled in during the course of the application processing.
My initial thought was to create MAXPARTITION (MAXVALUE) which would be a catch-all partition for records which do not have the date filled and enable ROW MOVEMENTS so that when the date is filled it moves into an appropriate partition. However I think it is not possible to have both MAXVALUE partition and interval partitioning together. Is that right?
Also, is there a better way to do this?
PARTITION BY RANGE ("RDATE") INTERVAL (NUMTODSINTERVAL (1,'DAY'))
SUBPARTITION BY HASH ("RKEY")
SUBPARTITION TEMPLATE (
SUBPARTITION "SP01",
SUBPARTITION "SP02",
SUBPARTITION "SP03",
SUBPARTITION "SP04",
SUBPARTITION "SP05",
SUBPARTITION "SP06",
SUBPARTITION "SP07",
SUBPARTITION "SP08",
SUBPARTITION "SP09",
SUBPARTITION "SP10",
SUBPARTITION "SP11",
SUBPARTITION "SP12",
SUBPARTITION "SP13",
SUBPARTITION "SP14",
SUBPARTITION "SP15",
SUBPARTITION "SP16" )
(PARTITION "INITIALPARTITION" VALUES LESS THAN (TO_DATE(' 2016-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')))
I expect a table with default and range partitions and records to move to the range partitions from the default when a column is filled.
With interval partitioning, the column you use as the partition key cannot be NULL, but you can use a workaround like this:
CREATE TABLE ... (
...
RDATE DATE,
PARTITION_KEY DATE GENERATED ALWAYS AS (COALESCE(RDATE, DATE '1969-12-31'))
)
PARTITION BY RANGE (PARTITION_KEY) INTERVAL (NUMTODSINTERVAL (1,'DAY'))
...
(PARTITION INITIAL_PARTITION VALUES LESS THAN (DATE '1970-01-01'))
ENABLE ROW MOVEMENT;
If you insert a record with RDATE = NULL, it will be inserted into partition INITIAL_PARTITION. For the initial date (e.g. 1970-01-01) you must select a value which will never fall into the range of "real" date values. You could also use a date in the far future, e.g.
CREATE TABLE ... (
...
RDATE DATE,
PARTITION_KEY DATE GENERATED ALWAYS AS (COALESCE(RDATE, DATE '2999-12-31'))
)
PARTITION BY RANGE (PARTITION_KEY) INTERVAL (NUMTODSINTERVAL (1,'DAY'))
...
(PARTITION INITIAL_PARTITION VALUES LESS THAN (DATE '2019-04-01'))
ENABLE ROW MOVEMENT;
-- Create DEFAULT_PARTITION
INSERT INTO ... (RDATE) VALUES (NULL);
ROLLBACK;
ALTER TABLE ... RENAME PARTITION FOR (TIMESTAMP '2999-12-31 00:00:00') TO DEFAULT_PARTITION;
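To make the routing easier to follow, here is a small Python model of the far-future variant. It is only a sketch of the logic, not of how Oracle implements it: the function names are invented, and the boundary and placeholder dates are copied from the DDL above.

```python
from datetime import date
from typing import Optional

# Values copied from the DDL above; the names themselves are hypothetical.
PLACEHOLDER = date(2999, 12, 31)      # COALESCE fallback for a NULL RDATE
INITIAL_BOUNDARY = date(2019, 4, 1)   # VALUES LESS THAN bound of INITIAL_PARTITION


def partition_key(rdate: Optional[date]) -> date:
    """Mimic the virtual column COALESCE(RDATE, DATE '2999-12-31')."""
    return PLACEHOLDER if rdate is None else rdate


def partition_label(rdate: Optional[date]) -> str:
    """Describe which partition a row with this RDATE would land in."""
    key = partition_key(rdate)
    if key == PLACEHOLDER:
        return "DEFAULT_PARTITION"
    if key < INITIAL_BOUNDARY:
        return "INITIAL_PARTITION"
    return f"daily partition for {key.isoformat()}"


# A row starts without a date and gets one later; with ROW MOVEMENT enabled
# the update relocates it from the default partition to a daily one.
print(partition_label(None))
print(partition_label(date(2019, 5, 2)))
```

The placeholder only works as long as no real RDATE can ever equal it, which is why a date far outside the expected range is chosen.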

Table partition on non date column

How can I partition a table in oracle on non-date column (Say partition on Username)?
So far I have only partitioned tables on date columns, for example:
CREATE TABLE X
(
Username Varchar2(10 Char),
Import_date Date
)
PARTITION BY RANGE ("IMPORT_DATE") INTERVAL (NUMTODSINTERVAL(1,'DAY'))
(PARTITION "CL_REP_DEF" VALUES LESS THAN
(TO_DATE(' 2018-06-29 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
)
Though I am not sure how to partition with username here.
Oracle offers three types of partitions:
Range
Hash
List
You can use any of them.
Selection of partitioning type depends on the data stored in a table and values of partitioned column (columns). If the number of distinct values in a column (columns) is limited and known, then LIST type would be a better choice.
As to your case, I think HASH partition fits the most.
Here's an example of how you can partition your X table:
CREATE TABLE X
(
Username Varchar2(10 Char),
Import_date Date
) PARTITION BY HASH(Username) PARTITIONS 16; -- 16 is the number of partitions.
You can find more about partitioning in official Oracle documentation.
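The idea behind hash partitioning can be illustrated with a few lines of Python. This is a conceptual sketch only: Oracle uses its own internal hash function, and SHA-256 is swapped in here purely so the example is deterministic. The function name hash_partition is made up.

```python
import hashlib


def hash_partition(username: str, partitions: int = 16) -> int:
    """Map a username to a partition number in the range 0 .. partitions-1.

    Stand-in for Oracle's internal hash function, which is NOT SHA-256.
    """
    digest = hashlib.sha256(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


# The same username always lands in the same partition, so lookups by
# Username only need to touch one of the 16 partitions.
for name in ["alice", "bob", "carol"]:
    print(name, hash_partition(name))
```

Oracle's documentation recommends a power of two for the number of hash partitions so that rows are spread evenly, which is consistent with the 16 used above.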

Oracle 12c - Table with size larger than 5 terabytes

In our database (Oracle 12c, Exadata) we plan to store sales data. Input text files containing sales data arrive on a daily basis (~1000 files every day, each containing ~20000 rows). The text files are read and transferred to the database as soon as possible. According to our calculations, the data will grow to 5 terabytes in one year.
Data format:
[transaction date][category][sales_number][buyer_id][other columns]
Sales data comes in 10 different categories with the same fields. Logically, the data can be stored in a single table or divided into 10 tables (one per category).
What is the best practice for storing such kind of big data in oracle? What kind of partitioning and indexing strategy should be applied?
Constraints: Data should be available to the marketing department for analysis within 2-3 days. Queries are based on the [sales_number] or [category],[buyer_id] or [buyer_id] columns.
If the number of categories is known and fixed, then you can use a subpartition for each category.
One approach could be this one:
CREATE TABLE SALES_DATA
(
TRANSACTION_DATE TIMESTAMP(0) NOT NULL,
CATEGORY NUMBER NOT NULL,
SALES_NUMBER NUMBER,
BUYER_ID NUMBER,
[OTHER COLUMNS]
)
PARTITION BY RANGE (TRANSACTION_DATE) INTERVAL (INTERVAL '1' DAY)
SUBPARTITION BY LIST (CATEGORY)
SUBPARTITION TEMPLATE
(
SUBPARTITION CAT_1 VALUES (1),
SUBPARTITION CAT_2 VALUES (2),
SUBPARTITION CAT_3_AND_4 VALUES (3,4),
SUBPARTITION CAT_5 VALUES (5),
...
SUBPARTITION CAT_10 VALUES (10),
SUBPARTITION CAT_OTHERS VALUES (DEFAULT)
)
(
PARTITION P_INITIAL VALUES LESS THAN (TIMESTAMP '2018-01-01 00:00:00')
);
Local indexes would be needed on sales_number and buyer_id. You can put every (sub)partition into a separate tablespace if required.

Extending existing partitioning

Below the simplified structure of a table:
create table customer(
incident_id number,
customer_id number,
customer_name varchar2(400),
sla_id number,
failure_start_date date,
failure_end_date date,
churn_flag number, -- 0 or 1
active number, -- 0 or 1
constraint pk_incident_id primary key (incident_id))
PARTITION BY LIST (active)
SUBPARTITION BY LIST (churn_flag)
SUBPARTITION TEMPLATE
( SUBPARTITION sp_churn_flag_1 VALUES (1)
, SUBPARTITION sp_churn_flag_0 VALUES (0)
)
(PARTITION sp_active_1 values (1)
, PARTITION sp_active_0 VALUES (0)
)
ENABLE ROW MOVEMENT COMPRESS FOR QUERY LOW;
Now I need to add interval-range partitioning on top of the existing composite list partitioning, in order to partition the data by month (failure_start_date, YYYYMM). The table contains data from 200701 up to now (201511). Rows with failure_start_date < 2013 should go into a single partition for older data. Each newer month should have a dedicated partition, and partitions for upcoming months should be created automatically.
How can this be integrated into the existing partitioning?
You cannot do it exactly the way you want. Partitioning strategies are limited in two relevant ways: first, composite strategies can only have two levels (you need 3), and second, interval partitioning, when used in a composite strategy, must be at the top level.
Here is the closest legal thing to what you want:
CREATE TABLE matt_customer
(
incident_id NUMBER,
customer_id NUMBER,
customer_name VARCHAR2 (400),
sla_id NUMBER,
failure_start_date DATE,
failure_end_date DATE,
churn_flag VARCHAR2 (1), -- 0 or 1
active VARCHAR2 (1), -- 0 or 1
active_churn_flags VARCHAR2 (2) GENERATED ALWAYS AS (active || churn_flag) VIRTUAL,
CONSTRAINT pk_incident_id PRIMARY KEY (incident_id)
)
PARTITION BY RANGE
(failure_start_date)
INTERVAL ( NUMTOYMINTERVAL (1, 'MONTH') )
SUBPARTITION BY LIST
(active_churn_flags)
SUBPARTITION TEMPLATE (
SUBPARTITION sp_ac_00 VALUES ('00'),
SUBPARTITION sp_ac_01 VALUES ('01'),
SUBPARTITION sp_ac_10 VALUES ('10'),
SUBPARTITION sp_ac_11 VALUES ('11'))
(PARTITION customer_old VALUES LESS THAN (TO_DATE ('01-JAN-2013', 'DD-MON-YYYY')))
ENABLE ROW MOVEMENT
--COMPRESS FOR QUERY LOW;
;
This uses interval-list partitioning and a virtual column that combines your active and churn_flag columns into one (I turned those columns into VARCHAR2(1) for simplicity).
To make use of partition pruning, your queries would need to be modified to select active_churn_flags = '01' for example, instead of specifying values for active and churn_flag independently.
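As a rough illustration of how the combined flag picks a subpartition, consider the following Python sketch. It only mirrors the concatenation and the template names from the DDL above; the function names are invented, and it says nothing about how Oracle evaluates the virtual column internally.

```python
VALID_FLAGS = {"00", "01", "10", "11"}  # the four values listed in the template


def active_churn_flags(active: str, churn_flag: str) -> str:
    """Mimic the virtual column: active || churn_flag."""
    return active + churn_flag


def subpartition_for(active: str, churn_flag: str) -> str:
    """Return the template subpartition name for a pair of flags."""
    flags = active_churn_flags(active, churn_flag)
    if flags not in VALID_FLAGS:
        # No DEFAULT subpartition exists in the template, so other values
        # have nowhere to go.
        raise ValueError(f"no subpartition for flags {flags!r}")
    return f"sp_ac_{flags}"


print(subpartition_for("0", "1"))
```

A query that filters on active = '0' and churn_flag = '1' separately does not mention the virtual column at all, which is why the predicate has to be written against active_churn_flags to get pruning.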

ORACLE - Partitioning with changing values

Assuming following table:
create table INVOICE(
INVOICE_ID NUMBER
,INVOICE_SK NUMBER
,INVOICE_AMOUNT NUMBER
,INVOICE_TEXT VARCHAR2(4000 Char)
,B2B_FLAG NUMBER -- 0 or 1
,ACTIVE NUMBER(1) -- 0 or 1
)
PARTITION BY LIST (ACTIVE)
SUBPARTITION BY LIST (B2B_FLAG)
( PARTITION p_active_1 values (1)
( SUBPARTITION sp_b2b_flag_11 VALUES (1)
, SUBPARTITION sp_b2b_flag_10 VALUES (0)
)
,
PARTITION p_active_0 values (0)
( SUBPARTITION sp_b2b_flag_01 VALUES (1)
, SUBPARTITION sp_b2b_flag_00 VALUES (0)
)
)
For performance reasons the table should get "Composite List-List" partitioning, see http://docs.oracle.com/cd/E18283_01/server.112/e16541/part_admin001.htm#i1006565.
The problem is that the ACTIVE flag will change frequently for a huge number of records, and sometimes the B2B_FLAG as well. Will Oracle automatically recognize the records whose partitioning value has changed and move them to the appropriate partition, or do I have to call some kind of maintenance function to reorganize the partitions?
You need to enable row movement on the table or the update statement will fail with ORA-14402: updating partition key column would cause a partition change.
See the following testcase:
create table T_TESTPART
(
pk number(10),
part_key number(10)
)
partition by list (part_key) (
partition p01 values (1),
partition p02 values (2),
partition pdef values (default)
);
alter table T_TESTPART
add constraint pk_pk primary key (PK);
Now insert a row and try to update the partitioning value:
insert into t_testpart values (1,1);
update t_testpart set part_key = 2 where pk = 1;
You will now get the error mentioned above.
If you enable row movement, the same statement will work and Oracle will move the row to the other partition:
alter table t_testpart enable row movement;
update t_testpart set part_key = 2 where pk = 1;
I did not do any performance tests, but Oracle will probably delete the row from the first partition and insert it to the second partition. Consider this when using it in large scale.
In my own databases, I usually only use partitioning on columns that do not change.
Further reading:
http://www.dba-oracle.com/t_callan_oracle_row_movement.htm
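The test case can be mimicked with a toy Python model. This is a sketch under the assumption stated in the answer, namely that a partition change amounts to a delete from the old partition plus an insert into the new one; the class PartitionedTable is hypothetical and only reuses the partition names and the error text from above.

```python
class PartitionedTable:
    """Toy model of T_TESTPART: list partitions p01, p02 and a default."""

    def __init__(self, row_movement: bool = False):
        self.row_movement = row_movement
        self.partitions = {"p01": {}, "p02": {}, "pdef": {}}

    @staticmethod
    def _partition_for(part_key: int) -> str:
        return {1: "p01", 2: "p02"}.get(part_key, "pdef")

    def insert(self, pk: int, part_key: int) -> None:
        self.partitions[self._partition_for(part_key)][pk] = part_key

    def update(self, pk: int, new_key: int) -> None:
        source = next(name for name, rows in self.partitions.items() if pk in rows)
        target = self._partition_for(new_key)
        if source != target and not self.row_movement:
            raise RuntimeError(
                "ORA-14402: updating partition key column would cause a partition change"
            )
        # Assumed behaviour: delete from the old partition, insert into the new one.
        del self.partitions[source][pk]
        self.partitions[target][pk] = new_key


table = PartitionedTable()
table.insert(1, 1)
try:
    table.update(1, 2)
except RuntimeError as error:
    print(error)

table.row_movement = True   # alter table ... enable row movement
table.update(1, 2)
print(table.partitions)
```

If the delete-plus-insert assumption holds, each such update costs roughly as much as two DML operations plus the associated index maintenance, which is worth measuring before flipping flags on millions of rows.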
