Do projections in vertica have primary keys, secondary keys? How can I find out what is the key of a projection?
You had best go into Vertica's docu on projections:
https://www.vertica.com/docs/10.0.x/HTML/Content/Authoring/SQLReferenceManual/Statements/CREATEPROJECTION.htm
Primary and foreign keys exist, as do unique constraints - as constraints; but these constraints are usually disabled - because they slow down a load / insert process considerably.
Even if you choose to not specify segmentation and ordering clause of a projection: each projection is either unsegmented or segmented by a value that depends from the contents of one or more non-nullable columns (usually a HASH() on one or more columns), and ORDERed by one or more columns. The ORDER BY clause in a projection definition constitutes the data access path used in that projection. It can be somehow compared to indexing in classical databases.
To find out what the access path of a projection is - the quickest way is to fire a SELECT EXPORT_OBJECTS('','<tablename>', FALSE) at it. In our previously used example, you see that it's ordered by all its four columns, and segmented by the HASH() of all its four columns, as we created the table with no primary or foreign key:
$ vsql -Atc "SELECT EXPORT_OBJECTS('','example',FALSE)"
CREATE TABLE dbadmin.example
(
fname varchar(4),
lname varchar(5),
hdate date,
salary numeric(7,2)
);
CREATE PROJECTION dbadmin.example_super /*+basename(example),createtype(L)*/
(
fname,
lname,
hdate,
salary
)
AS
SELECT example.fname,
example.lname,
example.hdate,
example.salary
FROM dbadmin.example
ORDER BY example.fname,
example.lname,
example.hdate,
example.salary
SEGMENTED BY hash(example.fname, example.lname, example.hdate, example.salary) ALL NODES OFFSET 0;
Related
We have realized that we need to use Oracle (12 c) interval partitioning.We have an hierarchical entity model with lots of #OneToMany relationships.We want to use "partitioning by range" (day,month... ) on the "parent/root" entity (A) and "partition by reference" on all child entities (B).
From the Oracle documentation: "Reference partitioning allows the partitioning of two tables related to one another by referential constraints. The partitioning key is resolved through an existing parent-child relationship, enforced by enabled and active primary key and foreign key constraints." The problem is that child entities (B) can refer to other entities ( C) that they don't have any link with the "parent/root" entity (A). I can create partitions on A and on B but when I want to drop the partition on A ( partition on B on cascade), I get an error integrity error and it fails. it works only if I delete all records on C and B and then partition them. I don't want to do that as it's not efficient and slow compared to dropping partitions directly
Please is there a way to create a partition on table C based on A(creation_date) without adding any foreign constraint between A and C?
Small example to illustrate the case
A - parent entity
B - child entity to A
C - child entity to B
create table
A (
id number primary key,
creation_date date
)
partition by range (creation_date)
INTERVAL(NUMTOYMINTERVAL(1, 'MONTH'))
(
partition p1 values less than (to_date('20180501','yyyymmdd'))
);
create table
B (
id number primary key,
value varchar2(5),
a_id number not null,
constraint fk_ba foreign key (a_id) references A
)
partition by reference(fk_ba);
create table
C (
id number primary key,
code varchar2(5),
b_id number not null,
constraint fk_cb foreign key (b_id) references B
);
It was not clear in the documentation that partition by reference is transitive ( A->B->C). I have discovered it by testing. we can create table C "partition by reference (fk_cb)" with the result that when we drop A partition the matching B and C partitions get dropped.
I am having multiple products and each of them are having there own Product table and Value table. Now I have to create a generic screen to validate those product and I don't want to create validated table for each Product. I want to create a generic table which will have all the Products details and one extra column called ProductIdentifier. but the problem is that here in this generic table I may end up putting millions of records and while fetching the data it will take time.
Is there any other better solution???
"Millions of records" sounds like a VLDB problem. I'd put the data into a partitioned table:
CREATE TABLE myproducts (
productIdentifier NUMBER,
value1 VARCHAR2(30),
value2 DATE
) PARTITION BY LIST (productIdentifier)
( PARTITION p1 VALUES (1),
PARTITION p2 VALUES (2),
PARTITION p5to9 VALUES (5,6,7,8,9)
);
For queries that are dealing with only one product, specify the partition:
SELECT * FROM myproducts PARTITION FOR (9);
For your general report, just omit the partition and you get all numbers:
SELECT * FROM myproducts;
Documentation is here:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/vldbg/toc.htm
I have a target table T1 which doesn't have a Date field. The current size is increasing rapidly. Hence I need to add a Date field and also perform table partitioning on this target table.
T1 has PRIMARY KEY (DOCID, LABID)
and has CONSTRAINT FOREIGN KEY (DOCID) REFERENCES T2
Table T2 is also complex table and has many rules in it.
T2 has PRIMARY KEY (DOCID)
My question is, as I need to partition T1. Is it possible NOT TO perform any step for T2 before T1 is partition? DBA told me that I need to partition T2 first before I touch T1??
There is no need to partition T2 before partitioning T1. Foreign key constraints do not care in the slightest bit about partitioning.
Best of luck.
You have – as proposed by others - two options. The first one is to add a redundant column DATE to the table T1 (child) and introduce the range partitioning on this column.
The second option is to use reference partitioning. Below is the simplified DDL for those options.
Range partitioning on child
create table T2_P2 /* parent */
(docid number not null,
trans_date date not null,
pad varchar2(100),
CONSTRAINT t2_p2_pk PRIMARY KEY(docid)
);
create table T1_P2 /* child */
(docid number not null,
labid number not null,
trans_date date not null, /** redundant column **/
pad varchar2(100),
CONSTRAINT t1_p2_pk PRIMARY KEY(docid, labid),
CONSTRAINT t1_p2_fk
FOREIGN KEY(docid) REFERENCES T2_P2(docid)
)
PARTITION BY RANGE(trans_date)
( PARTITION Q1_2016 VALUES LESS THAN (TO_DATE('01-APR-2016','DD-MON-YYYY'))
);
Reference partition
create table T2_RP /* parent */
(docid number not null,
trans_date date not null,
pad varchar2(100),
CONSTRAINT t2_rp_pk PRIMARY KEY(docid)
)
PARTITION BY RANGE(trans_date)
( PARTITION Q1_2016 VALUES LESS THAN (TO_DATE('01-APR-2016','DD-MON-YYYY'))
);
create table T1_RP /* child */
(docid number not null,
labid number not null,
pad varchar2(100),
CONSTRAINT t1_rp_pk PRIMARY KEY(docid, labid),
CONSTRAINT t1_rp_fk
FOREIGN KEY(docid) REFERENCES T2_RP(docid)
)
PARTITION BY REFERENCE(t1_rp_fk);
Your question is basically if the first option is possible, so the answer is YES.
To decide if the first option is preferable I’d suggest checking three criteria:
Migration
The first option requires a new DATE column in the child table that must be initialized during the migration (and of course correct maintained by the application).
Lifecycle
It could be that the lifecycle of both tables is the same (e.g. both parent and child records are kept for 7 year). In this case is preferable if both tables are partitioned (on the same key).
Access
For queries such as below you will profit from the reference partitioning (both tables are pruned - i.e. only the partitions with the accessed date are queried).
select * from T2_RP T2 join T1_RP T1 on t2.docid = t1.docid
where t2.trans_date = to_date('01012016','ddmmyyyy');
In the first option you will (probalbly) end with FTS on T2 and to get pruning on T1 you need to add predicate T2.trans_date = t1.trans_date
Having said that, I do not claim that the reference partition is superior. But I think it's worth to examine both options in your context and see which one is better.
It's been a while since I've written code, and I never used SQLite before, but many-to-many relationships used to be so fundamental, there must be a way to make them fast...
This is a abstracted version of my database:
CREATE TABLE a (_id INTEGER PRIMARY KEY, a1 TEXT NOT NULL);
CREATE TABLE b (_id INTEGER PRIMARY KEY, fk INTEGER NOT NULL REFERENCES a(_id));
CREATE TABLE d (_id INTEGER PRIMARY KEY, d1 TEXT NOT NULL);
CREATE TABLE c (_id INTEGER PRIMARY KEY, fk INTEGER NOT NULL REFERENCES d(_id));
CREATE TABLE b2c (fk_b NOT NULL REFERENCES b(_id), fk_c NOT NULL REFERENCES c(_id), CONSTRAINT PK_b2c_desc PRIMARY KEY (fk_b, fk_c DESC), CONSTRAINT PK_b2c_asc UNIQUE (fk_b, fk_c ASC));
CREATE INDEX a_a1 on a(a1);
CREATE INDEX a_id_and_a1 on a(_id, a1);
CREATE INDEX b_fk on b(fk);
CREATE INDEX b_id_and_fk on b(_id, fk);
CREATE INDEX c_id_and_fk on c(_id, fk);
CREATE INDEX c_fk on c(fk);
CREATE INDEX d_id_and_d1 on d(_id, d1);
CREATE INDEX d_d1 on d(d1);
I have put in any index i could think of, just to make sure (and more than is reasonable, but not a problem, since the data is read only). And yet on this query
SELECT count(*)
FROM a, b, b2c, c, d
WHERE a.a1 = "A"
AND a._id = b.fk
AND b._id = b2c.fk_b
AND c._id = b2c.fk_c
AND d._id = c.fk
AND d.d1 ="D";
the relation table b2c does not use any indexes:
0|0|2|SCAN TABLE b2c
0|1|1|SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?)
0|2|0|SEARCH TABLE a USING INTEGER PRIMARY KEY (rowid=?)
0|3|3|SEARCH TABLE c USING INTEGER PRIMARY KEY (rowid=?)
0|4|4|SEARCH TABLE d USING INTEGER PRIMARY KEY (rowid=?)
The query is about two orders of magnitude to slow to be usable. Is there any way to make SQLite use an index on b2c?
Thanks!
In a nested loop join, the outermost table does not use an index for the join (because the database just goes through all rows anyway).
To be able to use an index for a join, the index and the other column must have the same affinity, which usually means that both columns must have the same type.
Change the types of the b2c columns to INTEGER.
If the lookups on a1 or d1 are very selective, using a or d as the outermost table might make sense, and would then allow to use an index for the filter.
Try running ANALYZE.
If that does not help, you can force the join order with CROSS JOIN or INDEXED BY.
A little bit late in the project we have realized that we need to use Oracle (11G) partitioning both for performance and admin of the data.
We have an hierarchical entity model with lots of #OneToMany and #OneToOne relationships, and some entities are referenced from 2 or more other entities.
We want to use "partitioning by range" (month) on the "parent/root" entity and "partition by reference" on all child entities. After a year we will move the oldest month partition to an archive db. It's a 24-7 system so the data continuously grows.
From the Oracle documentation: "Reference partitioning allows the partitioning of two tables related to one another by referential constraints. The partitioning key is resolved through an existing parent-child relationship, enforced by enabled and active primary key and foreign key constraints."
Is it possible to use "partition by reference" on a table when there are 2 foreign keys and one of them can be null?
(From what I read you have to use "not null" on the foreign key for "partition by reference")
A small example to illustrate the problem:
A - parent entity
B - child entity to A
C - child entity to A or B
create table
A (
id number primary key,
adate date
)
partition by range (adate) (
partition p1 values less than (to_date('20130501','yyyymmdd')),
partition p2 values less than (to_date('20130601','yyyymmdd')),
partition pm values less than (maxvalue)
);
create table
B (
id number primary key,
text varchar2(5),
a_id number not null,
constraint fk_ba foreign key (a_id) references A
)
partition by reference(fk_ba);
create table
C (
id number primary key,
text varchar2(5),
a_id number not null, -- NOT POSSIBLE as a_id or b_id will be null..
b_id number not null, -- NOT POSSIBLE as a_id or b_id will be null..
constraint fk_ca foreign key (a_id) references A,
constraint fk_cb foreign key (b_id) references B
)
partition by reference(fk_ca)
partition by reference(fk_cb);
Thanks for any advice.
/Mats
You cannot partition by two foreign keys.
If A is parent of B and B is parent of C i would suggest partitioning C by fk_cb. weather you'l gain max pruning when joining A and C - thats an interesting question, why wont you run a test for us ?
question - why you have FK of A in table C. isn't A implied by the fk to B ?
(my guess, technically it's possible but oracle would have to access table B. i dont think he will do that so i think you wont get prunning).