Can shard key in MemSQL have NULLs? - sharding

What are the rules regarding the shard key and the key for a clustered columnstore?
I need to use one column as both the shard key and the clustered columnstore key, but it may contain NULLs.
What is the impact of using a nullable column as the shard key?
I have already tested a data load using this column, and at a high level everything looks good for the first batch, but will it break anything on later writes or reads?
CREATE TABLE test (
    name varchar(25) DEFAULT NULL,
    ID int(11) DEFAULT NULL,
    update_date date DEFAULT NULL,
    SHARD KEY (update_date) USING CLUSTERED COLUMNSTORE
)

NULL values are allowed in shard keys and columnstore keys, and they behave like any other value. If you define the shard key on a column that contains NULLs, all rows with NULL in that column are placed on the same partition, so a large share of NULLs can skew data toward that partition.
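To gauge how much data would land on that single partition, one rough check is to count the rows that have NULL in the shard key column. This is a minimal sketch against the test table from the question; the column aliases are illustrative, and it assumes MySQL-style boolean expressions, which SingleStore (MemSQL) accepts.

```sql
-- Count how many rows carry a NULL shard key value.
-- All of these rows hash to the same partition, so a large
-- null_rows share relative to total_rows suggests data skew.
SELECT
    COUNT(*)                 AS total_rows,
    SUM(update_date IS NULL) AS null_rows
FROM test;
```

If null_rows is a small fraction of total_rows, the NULLs are unlikely to matter much; if it is large, that one partition will hold disproportionately more data than the others.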

Related

SQL datatype to store 8000 characters in a column and should also be a primary key

I need a column that stores more than 4000 characters (to be exact, between 4000 and 8000), and I want that column to be the primary key, because the UI depends on that particular column being the primary key.
I tried CLOB and varchar(max), but those datatypes don't allow a primary key.
Can anyone suggest a datatype that satisfies both conditions?
I'm using Oracle SQL Developer 18.3.0.277.
You cannot specify a primary key or unique constraint on a CLOB column. If you try then you get the exception:
ORA-02329: column of datatype LOB cannot be unique or a primary key
You can have a composite primary key with two VARCHAR2(4000) columns:
CREATE TABLE table_name (
    value_start VARCHAR2(4000),
    value_end VARCHAR2(4000),
    CONSTRAINT table_name__value__pk PRIMARY KEY (value_start, value_end)
);
Or, from Oracle 12c, you can set the initialization parameter MAX_STRING_SIZE to EXTENDED and use:
CREATE TABLE table_name (
    value_start VARCHAR2(8000) PRIMARY KEY
);
However, I would question the design choice of using a VARCHAR2(8000) column as a primary key, as you will need to duplicate the (large) column value in all the referential constraints. Instead, make the value UNIQUE and provide an IDENTITY column that acts as the primary key, with a one-to-one correspondence to the large string values.
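The surrogate-key suggestion could look roughly like the following. This is only a sketch: the constraint names and the child_table are hypothetical, it assumes Oracle 12c or later with MAX_STRING_SIZE set to EXTENDED, and it assumes the database block size is large enough for a unique index on an 8000-byte column.

```sql
-- Assumes Oracle 12c or later with MAX_STRING_SIZE = EXTENDED.
-- Assumption: the database block size is large enough for a unique index
-- on an 8000-byte column; otherwise Oracle may reject it with ORA-01450.
CREATE TABLE table_name (
    id    NUMBER GENERATED ALWAYS AS IDENTITY,
    value VARCHAR2(8000) NOT NULL,
    CONSTRAINT table_name__id__pk PRIMARY KEY (id),
    CONSTRAINT table_name__value__u UNIQUE (value)
);

-- Hypothetical child table: it references the small numeric key
-- instead of duplicating the large string.
CREATE TABLE child_table (
    id        NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    parent_id NUMBER NOT NULL,
    CONSTRAINT child_table__parent__fk
        FOREIGN KEY (parent_id) REFERENCES table_name (id)
);
```

With this layout, the large string is stored once and every referencing table carries only the numeric id.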

MemSQL: Load data with "skip duplicate key" option is extremely slow

I am evaluating the loading performance of SingleStore 7.6.10.
I tested two ways of loading; both are important in real-world practice:
1. Loading that skips rows with duplicate primary keys:
load data local infile '/opt/orders.tbl' skip duplicate key errors into table ORDERS fields terminated by '|' lines terminated by '|\n' max_errors 0;
2. Loading that replaces rows with duplicate primary keys with the latest records:
load data local infile '/opt/orders.tbl' replace into table orders_sf1_col columns terminated by '|';
Before running the tests, I guessed that both methods would have similar load times, because both need to look up the primary key to find duplicates. If there were any difference, I expected REPLACE to take longer, because it needs to delete the current record and insert the latest one.
But to my surprise, loading with SKIP was extremely slow and took almost 8 minutes to load a 163 MB data file, whereas the REPLACE load of the same file into the same table finished in less than 15 seconds.
Both tests were run in the same test environment (3 VMs) with the same data file and the same target table. To simulate duplicate-key conflicts, I ran two consecutive loads into an empty table and measured only the second one.
My question is why skip duplicate key errors performs so slowly, and whether there is a better way to achieve the same effect.
The DDL is here:
CREATE TABLE `orders_sf1_col` (
    `O_ORDERKEY` int(11) NOT NULL,
    `O_CUSTKEY` int(11) NOT NULL,
    `O_ORDERSTATUS` char(1) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    `O_TOTALPRICE` decimal(15,2) NOT NULL,
    `O_ORDERDATE` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00.000000',
    `O_ORDERPRIORITY` varchar(15) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    `O_CLERK` varchar(15) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    `O_SHIPPRIORITY` int(11) NOT NULL,
    `O_COMMENT` varchar(79) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    `O_NOP` varchar(79) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
    UNIQUE KEY `PRIMARY` (`O_ORDERKEY`) USING HASH,
    KEY `ORDERS_FK1` (`O_CUSTKEY`) USING HASH,
    KEY `ORDERS_DT_IDX` (`O_ORDERDATE`) USING HASH,
    SHARD KEY `__SHARDKEY` (`O_ORDERKEY`) USING CLUSTERED COLUMNSTORE
) AUTOSTATS_CARDINALITY_MODE=INCREMENTAL AUTOSTATS_HISTOGRAM_MODE=CREATE AUTOSTATS_SAMPLING=ON SQL_MODE='STRICT_ALL_TABLES'
Thanks
SKIP DUPLICATE KEY ERRORS is the more resource-intensive option because it uses a clustered index scan, which is why it took longer.
REPLACE, on the other hand, uses fewer server resources because it uses a clustered index seek, which noticeably reduces the execution time.
The latest SingleStore version (7.8) gives better results; please go through the official documentation.

Oracle Reference Partition - How to make sure I'm getting the benefits when I select

I'm trying to make sure I'm getting the benefit of selecting from a partition when using reference partitions.
With normally partitioned tables, I know you have to include the column(s) on which the partition is defined in the WHERE clause so that Oracle knows it can search just one specific partition.
My question is, when I'm selecting from a reference-partitioned table, do I just need to include the column on which the reference foreign key is defined? Or do I need to join and include the parent table's column on which the partition is actually defined?
create table alpha (
    name varchar2(240) not null,
    partition_no number(14) not null,
    constraint alpha_pk
        primary key (name),
    constraint alpha_c01
        check (partition_no > 0)
)
partition by range(partition_no)
interval (1)
(partition empty values less than (1))
;
create table beta (
    name varchar2(240) not null,
    alpha_name varchar2(240) not null,
    some_data number not null,
    constraint beta_pk
        primary key (name),
    constraint beta_f01
        foreign key (alpha_name)
        references alpha (name)
)
partition by reference (beta_f01)
;
Assume the tables in production will have much more data in them, with hundreds of millions of rows in the beta table, but merely thousands per partition.
Is this all I need?
select b.some_data
from beta b
where b.alpha_name = 'Blah'
;
Thanks if anyone can verify this for me, or explain anything else I'm missing about properly creating indexes on reference-partitioned tables.
[Edit] Removed part of the example WHERE clause that shouldn't have been there. The example is meant to represent reading the reference-partitioned table with just the reference partition foreign key in the WHERE clause.
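One way to check whether a given query benefits from partition pruning is to look at its execution plan. The sketch below uses the standard EXPLAIN PLAN statement and DBMS_XPLAN.DISPLAY on the query from the question; it shows how to inspect the plan and does not assume what the plan will be.

```sql
-- Ask the optimizer for the plan of the query from the question.
EXPLAIN PLAN FOR
SELECT b.some_data
FROM beta b
WHERE b.alpha_name = 'Blah';

-- Show the plan; the Pstart and Pstop columns indicate which
-- partitions each step is expected to touch.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If Pstart and Pstop show a single partition (or KEY, meaning the partition is resolved at run time), the query is being pruned; if they span the whole partition range, every partition is being visited.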

Is primary key by default indexed in oracle

I have a table with a long value as the primary key.
I think Oracle will create an index on it by default, so I don't need to create an index explicitly.
The question is: is a primary key indexed by default in Oracle in this case?
Yes, a primary key (or any unique column constraint) will create an index, if there is not already one present.
This is the case for almost all databases. Otherwise the uniqueness constraint cannot be efficiently enforced.
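This can be confirmed from the data dictionary. The following is a small sketch in which pk_demo and its columns are hypothetical names used only for illustration; USER_INDEXES and USER_CONSTRAINTS are the standard Oracle dictionary views.

```sql
-- Hypothetical demo table with a numeric primary key.
CREATE TABLE pk_demo (
    id      NUMBER(19) PRIMARY KEY,
    payload VARCHAR2(100)
);

-- List the indexes Oracle created for the table.
SELECT index_name, uniqueness
FROM user_indexes
WHERE table_name = 'PK_DEMO';

-- Show which index backs the primary key constraint.
SELECT constraint_name, index_name
FROM user_constraints
WHERE table_name = 'PK_DEMO'
  AND constraint_type = 'P';
```

Because the constraint is not named here, Oracle generates a system name (of the form SYS_C...) for both the constraint and its index.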

Case-insensitive primary key in Oracle

The semantics of our data are case-insensitive, so we configure the Oracle sessions to be case-insensitive:
alter session set NLS_COMP=LINGUISTIC;
alter session set NLS_SORT=BINARY_AI;
Then, to take advantage of indexes, we also want the primary key to be case-insensitive:
create table SCHEMA_PROPERTY (
    NAME nvarchar2(64) not null,
    VALUE nvarchar2(1024),
    constraint SP_PK primary key (nlssort(NAME))
)
However, this runs into "ORA-00904: : invalid identifier", so I assume it is not possible to use the nlssort() function in the PK definition.
Next attempt was to associate a case-insensitive unique index to the primary key:
create table SCHEMA_PROPERTY (
    NAME nvarchar2(64) primary key using index (
        create unique index SP_UQ on SCHEMA_PROPERTY(nlssort(NAME))),
    VALUE nvarchar2(1024)
);
but this failed too:
Error: ORA-14196: Specified index cannot be used to enforce the constraint.
14196. 00000 - "Specified index cannot be used to enforce the constraint."
*Cause: The index specified to enforce the constraint is unsuitable
for the purpose.
*Action: Specify a suitable index or allow one to be built automatically.
Should I just conclude that Oracle does not support case-insensitive semantics for a PK constraint? This works fine in MSSQL, which has a simpler approach to collations.
We could, of course, create a unique index instead of the primary key, but I wanted to first make sure that the normal way of doing this is not supported.
Our Oracle version is 11.2.0.1.
As you are on 11.2 you can use a virtual column to achieve this:
CREATE TABLE SCHEMA_PROPERTY (
    REAL_NAME nvarchar2(64) not null,
    NAME generated always as (lower(real_name)) primary key,
    VALUE nvarchar2(1024)
);
Alternatively, keep the ordinary primary key and create a unique function-based index to enforce case-insensitive uniqueness:
create table SCHEMA_PROPERTY (
    NAME nvarchar2(64),
    VALUE nvarchar2(1024),
    constraint SP_PK primary key (NAME)
);
create unique index SP_UN on SCHEMA_PROPERTY(lower(NAME));
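A brief sketch of how the unique index on lower(NAME) behaves, assuming the SCHEMA_PROPERTY table and SP_UN index defined above; the property names and values are made up for illustration.

```sql
-- Assumes the SCHEMA_PROPERTY table and SP_UN index defined above.
INSERT INTO SCHEMA_PROPERTY (NAME, VALUE) VALUES ('Timeout', '30');

-- Differs only in case, so lower(NAME) collides with the row above;
-- this statement is expected to fail with ORA-00001 (unique constraint violated).
INSERT INTO SCHEMA_PROPERTY (NAME, VALUE) VALUES ('TIMEOUT', '60');

-- Case-insensitive lookup written against the same expression as the index.
SELECT NAME, VALUE
FROM SCHEMA_PROPERTY
WHERE lower(NAME) = lower('TimeOut');
```

Writing the lookup as lower(NAME) = ... matches the indexed expression, which gives the optimizer the option of using SP_UN for the search.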
