What are the pitfalls of Cassandra materialised views and IN queries? - events

Let's say I have a table
CREATE TABLE events (
stream_id text,
type text,
data text,
timestamp timestamp,
PRIMARY KEY (stream_id, timestamp)
);
The request pattern is that I need to get all events by stream_id.
e.g. SELECT * FROM events WHERE stream_id = 'A-1';
Then I need to get all events given a set of types. So I create a MV:
CREATE MATERIALIZED VIEW events_by_type AS
SELECT * FROM events
WHERE type IS NOT NULL AND
timestamp IS NOT NULL
PRIMARY KEY (type, stream_id, timestamp)
WITH CLUSTERING ORDER BY (timestamp DESC);
The request is like
SELECT * FROM events_by_type WHERE type IN ('T1', 'T2);
What are the pitfalls with this query patterns and data model?
If any, is it possible to improve it?

Only pitfall I can think of that you may hit is the consistency with the view is not reflected in the consistency level of the write to the base table. So if you need stronger consistency (quorum on reads & writes) you may run into issues.
One concern is that your partitions are unbounded. On current versions if your building larger than 100mb or so partitions you can start having misc performance issues (works, but will sometimes require GC tuning to keep things moving). This is improving recently but you should break up your partitions some. i.e.
CREATE TABLE events (
stream_id text,
time_bucket text,
type text,
data text,
timestamp timestamp,
PRIMARY KEY ((stream_id, time_bucket), timestamp)
);
CREATE MATERIALIZED VIEW events_by_type AS
SELECT * FROM events WHERE
type IS NOT NULL AND
time_bucket IS NOT NULL AND
timestamp IS NOT NULL
PRIMARY KEY ((type, time_bucket), stream_id, timestamp)
WITH CLUSTERING ORDER BY (timestamp DESC);
It adds a little complexity in your time_bucket needs to be known. You can either predefine the buckets to something like daily (ie 2016-10-10 00:00:00) or create a new table that maps the possible time_buckets for a type or stream_id.

This may also be an case where an old fashioned secondary index is a reasonable choice. Assuming there is a reasonably bounded but not tiny number of types a secondary index could work OK. (If there are only a tiny number of different types then you may also run into problems with large partitions in the materialized view.)

Related

Oracle: updating data in referenced partition scenario is taking longer time

I have table partitioned on a column(rcrd_expry_ts) of date type. We are updating this rcrd_expry_ts weekly by another job. We noticed the update query is taking quite longer time (1 to 1.5 min) even for few rows and I think longer time is taken for actually moving data internally to different partitioned. There can be a million of rows eligible to update rcrd_expry_ts by our weekly job.
CREATE TABLE tbl_parent
(
"parentId" NUMBER NOT NULL ENABLE,
"RCRD_DLT_TSTP" timestamp default timestamp '9999-01-01 00:00:00' NOT NULL
)
PARTITION BY RANGE ("RCRD_DLT_TSTP") INTERVAL (NUMTOYMINTERVAL('1','MONTH')) (PARTITION "P1" VALUES LESS THAN (TO_DATE('2010-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')));
CREATE TABLE tbl_child
(
"foreign_id" NUMBER NOT NULL ENABLE,
"id" NUMBER NOT NULL ENABLE,
constraint fk_id foreign key("foreign_id") references
tbl_parent("parentId")
)partition by reference (fk_id);
I am updating RCRD_DLT_TSTP in parent table from some another job (using simple update query) but I noticed that it took around 1 to 1.5 min to execute, probably due to creating partition and move data into corresponding partition. Is there any better way to achieve this in Oracle
The table has a referenced partitioned child. So any rows moving partition in the parent will have to be cascaded to the child table too.
This means you could be moving substantially more rows that the "few rows" that change in the parent.
It's also worth checking if the update can identify the rows it needs to change faster too.
You can do this by getting the plan for the update statement like this:
update /*+ gather_plan_statistics */ <your update statement>;
select *
from table(dbms_xplan.display_cursor( format => 'ALLSTATS LAST' ));
This will give you the plan for the update with its run time stats. This will help in identifying if there are any indexes you can create to improve performance.
Is there any better way to achieve this in Oracle
This is a question that needs to be answered in the larger context. You may well be able to make this process faster by unpartitioning the table and using indexes to identify the rows to change.
But this affects all the other statements that access this table. To what extent do they benefit from partitioning? If the answer is substantially, is it worth making this process faster at the expense of these others? What trade-offs are you willing to make here?

Query a table in different ways or orderings in Cassandra

I've recently started to play around with Cassandra. My understanding is that in a Cassandra table you define 2 keys, which can be either single column or composites:
The Partitioning Key: determines how to distribute data across nodes
The Clustering Key: determines in which order the records of a same partitioning key (i.e. within a same node) are written. This is also the order in which the records will be read.
Data from a table will always be sorted in the same order, which is the order of the clustering key column(s). So a table must be designed for a specific query.
But what if I need to perform 2 different queries on the data from a table. What is the best way to solve this when using Cassandra ?
Example Scenario
Let's say I have a simple table containing posts that users have written :
CREATE TABLE posts (
username varchar,
creation timestamp,
content varchar,
PRIMARY KEY ((username), creation)
);
This table was "designed" to perform the following query, which works very well for me:
SELECT * FROM posts WHERE username='luke' [ORDER BY creation DESC];
Queries
But what if I need to get all posts regardless of the username, in order of time:
Query (1): SELECT * FROM posts ORDER BY creation;
Or get the posts in alphabetical order of the content:
Query (2): SELECT * FROM posts WHERE username='luke' ORDER BY content;
I know that it's not possible given the table I created, but what are the alternatives and best practices to solve this ?
Solution Ideas
Here are a few ideas spawned from my imagination (just to show that at least I tried):
Querying with the IN clause to select posts from many users. This could help in Query (1). When using the IN clause, you can fetch globally sorted results if you disable paging. But using the IN clause quickly leads to bad performance when the number of usernames grows.
Maintaining full copies of the table for each query, each copy using its own PRIMARY KEY adapted to the query it is trying to serve.
Having a main table with a UUID as partitioning key. Then creating smaller copies of the table for each query, which only contain the (key) columns useful for their own sort order, and the UUID for each row of the main table. The smaller tables would serve only as "sorting indexes" to query a list of UUID as result, which can then be fetched using the main table.
I'm new to NoSQL, I would just want to know what is the correct/durable/efficient way of doing this.
The SELECT * FROM posts ORDER BY creation; will results in a full cluster scan because you do not provide any partition key. And the ORDER BY clause in this query won't work anyway.
Your requirement I need to get all posts regardless of the username, in order of time is very hard to achieve in a distributed system, it supposes to:
fetch all user posts and move them to a single node (coordinator)
order them by date
take top N latest posts
Point 1. require a full table scan. Indeed as long as you don't fetch all records, the ordering can not be achieve. Unless you use Cassandra clustering column to order at insertion time. But in this case, it means that all posts are being stored in the same partition and this partition will grow forever ...
Query SELECT * FROM posts WHERE username='luke' ORDER BY content; is possible using a denormalized table or with the new materialized view feature (http://www.doanduyhai.com/blog/?p=1930)
Question 1:
Depending on your use case I bet you could model this with time buckets, depending on the range of times you're interested in.
You can do this by making the primary key a year,year-month, or year-month-day depending on your use case (or finer time intervals)
The basic idea is that you bucket changes for what suites your use case. For example:
If you often need to search these posts over months in the past, then you may want to use the year as the PK.
If you usually need to search the posts over several days in the past, then you may want to use a year-month as the PK.
If you usually need to search the post for yesterday or a couple of days, then you may want to use a year-month-day as your PK.
I'll give a fleshed out example with yyyy-mm-dd as the PK:
The table will now be:
CREATE TABLE posts_by_creation (
creation_year int,
creation_month int,
creation_day int,
creation timeuuid,
username text, -- using text instead of varchar, they're essentially the same
content text,
PRIMARY KEY ((creation_year,creation_month,creation_day), creation)
)
I changed creation to be a timeuuid to guarantee a unique row for each post creation event. If we used just a timestamp you could theoretically overwrite an existing post creation record in here.
Now we can then insert the Partition Key (PK): creation_year, creation_month, creation_day based on the current creation time:
INSERT INTO posts_by_creation (creation_year, creation_month, creation_day, creation, username, content) VALUES (2016, 4, 2, now() , 'fromanator', 'content update1';
INSERT INTO posts_by_creation (creation_year, creation_month, creation_day, creation, username, content) VALUES (2016, 4, 2, now() , 'fromanator', 'content update2';
now() is a CQL function to generate a timeUUID, you would probably want to generate this in the application instead, and parse out the yyyy-mm-dd for the PK and then insert the timeUUID in the clustered column.
For a usage case using this table, let's say you wanted to see all of the changes today, your CQL would look like:
SELECT * FROM posts_by_creation WHERE creation_year = 2016 AND creation_month = 4 AND creation_day = 2;
Or if you wanted to find all of the changes today after 5pm central:
SELECT * FROM posts_by_creation WHERE creation_year = 2016 AND creation_month = 4 AND creation_day = 2 AND creation >= minTimeuuid('2016-04-02 5:00-0600') ;
minTimeuuid() is another cql function, it will create the smallest possible timeUUID for the given time, this will guarantee that you get all of the changes from that time.
Depending on the time spans you may need to query a few different partition keys, but it shouldn't be that hard to implement. Also you would want to change your creation column to a timeuuid for your other table.
Question 2:
You'll have to create another table or use materialized views to support this new query pattern, just like you thought.
Lastly if your not on Cassandra 3.x+ or don't want to use materialized views you can use Atomic batches to ensure data consistency across your several de-normalized tables (that's what it was designed for). So in your case it would be a BATCH statement with 3 inserts of the same data to 3 different tables that support your query patterns.
The solution is to create another tables to support your queries.
For SELECT * FROM posts ORDER BY creation;, you may need some special column for grouping it, maybe by month and year, e.g. PRIMARY KEY((year, month), timestamp) this way the cassandra will have a better performance on read because it doesn't need to scan the whole cluster to get all data, it will also save the data transfer between nodes too.
Same as SELECT * FROM posts WHERE username='luke' ORDER BY content;, you must create another table for this query too. All column may be same as your first table but with the different Primary Key, because you cannot order by the column that is not the clustering column.

Cassandra data model

I am a cassandra newbie trying to see how I can model our current sql data in cassandra. The database stores document metadata that includes document_id, last_modified_time, size_in_bytes among a host of other data, and the number of documents can be arbitrarily large and hence we are looking for a scalable solution for storage and query.
There is a requirement of 2 range queries
select all docs where last_modified_time >=x and last_modified_time
select all docs where size >= x and size <= y
And also a set of queries where docs needs to be grouped by specific metadata e.g.
select all docs where user in (x,y,z)
What is the best practice of designing the data model based on these queries?
My initial thought is to have a table (in Cassandra 2.0, CQL 3.0) with the last_mod_time as the secondary index as follows
create table t_document (
document_id bigint,
last_mod_time bigint ,
size bigint,
user text,
....
primary key (document_id, last_mod_time)
}
This should take care of query 1.
Do I need to create another table with the primary key as (document_id, size) for the query 2? Or can I just add the size as the third item in the primary key of the same table e.g. (document_id, last_mod_time, size). But in this case will the second query work without using the last_mod_time in the where clause?
For the query 3, which is all docs for one or more users, is it the best practice to create a t_user_doc table where the primary key is (user, doc_id)? Or a better approach is to create a secondary index on the user on the same t_document table?
Thanks for any help.
When it comes to inequalities, you don't have many choices in Cassandra. They must be leading clustering columns (or secondary indexes). So a data model might look like this:
CREATE TABLE docs_by_time (
dummy int,
last_modified_time timestamp,
document_id bigint,
size_in_bytes bigint,
PRIMARY KEY ((dummy),last_modified_time,document_id));
The "dummy" column is always set to the same value, and is sued as a placeholder partition key, with all data stored in a single partition.
The drawback to such a data model is that, indeed, all data is stored in a single partition. There is the maximum of 2 billion cells per partition, but more importantly, a single partition never spans nodes. So this approach doesn't scale.
You could create secondary indexes on a table:
CREATE TABLE docs (
document_id bigint,
last_modified_time timestamp,
size_in_bytes bigint,
PRIMARY KEY ((dummy),last_modified_time,document_id));
CREATE INDEX docs_last_modified on docs(last_modified);
However secondary indexes have important drawbacks (http://www.slideshare.net/edanuff/indexing-in-cassandra), and aren't recommended for data with high cardinality. You could mitigate the cardinality issue somewhat by reducing precision on last_modified_time by, say, only storing the day component.

Oracle Table structure

I have a table in Oracle Database which has 60 columns. Following is the table structure.
ID NAME TIMESTAMP PROERTY1 ...... PROPERTY60
This table will have many rows. the size of the table will be in GBs. But the problem with the table structure is that in future if I have to add a new property, I have to change the schema. To avoid that I want to change the table structure to following.
ID NAME TIMESTAMP PROPERTYNAME PROPERTYVALUE
A sample row will be.
1 xyz 40560 PROPERTY1 34500
In this way I will be able to solve the issue but the size of the table will grow bigger. Will it have any impact on performance in terms on fetching data. I am new to Oracle. I need your suggestion on this.
if I have to add a new property, I have to change the schema
Is that actually a problem? Adding a column has gotten cheaper and more convenient in newer versions of Oracle.
But if you still need to make your system dynamic, in a sense that you don't have to execute DDL for new properties, the following simple EAV implementation would probably be a good start:
CREATE TABLE FOO (
FOO_ID INT PRIMARY KEY
-- Other fields...
);
CREATE TABLE FOO_PROPERTY (
FOO_ID INT REFERENCES FOO (FOO_ID),
NAME VARCHAR(50),
VALUE VARCHAR(50) NOT NULL,
CONSTRAINT FOO_PROPERTY_PK PRIMARY KEY (FOO_ID, NAME)
) ORGANIZATION INDEX;
Note ORGANIZATION INDEX: the whole table is just one big B-Tree, there is no table heap at all. Properties that belong to the same FOO_ID are stored physically close together, so retrieving all properties of the known FOO_ID will be cheap (but not as cheap as when all the properties were in the same row).
You might also want to consider whether it would be appropriate to:
Add more indexes in FOO_PROPERTY (e.g. for searching on property name or value). Just beware of the extra cost of secondary indexes in index-organized tables.
Switch the order of columns in the FOO_PROPERTY PK - if you predominantly search on property names and rarely retrieve all the properties of the given FOO_ID. This would also make the index compression feasible, since the leading edge of the index is now relatively wide string (as opposed to narrow integer).
Use a different type for VALUE (e.g. RAW, or even in-line BLOB/CLOB, which can have performance implications, but might also provide additional flexibility). Alternatively, you might even have a separate table for each possible value type, instead of stuffing everything in a string.
Separate property "declaration" to its own table. This table would have two keys: beside string NAME it would also have integer PROPERTY_ID which can then be used as a FK in FOO_PROPERTY instead of the NAME (saving some storage, at the price of more JOIN-ing).

oracle- index organized table

what is use-case of IOT (Index Organized Table) ?
Let say I have table like
id
Name
surname
i know the IOT but bit confuse about the use case of IOT
Your three columns don't make a good use case.
IOT are most useful when you often access many consecutive rows from a table. Then you define a primary key such that the required order is represented.
A good example could be time series data such as historical stock prices. In order to draw a chart of the stock price of a share, many rows are read with consecutive dates.
So the primary key would be stock ticker (or security ID) and the date. The additional columns could be the last price and the volume.
A regular table - even with an index on ticker and date - would be much slower because the actual rows would be distributed over the whole disk. This is because you cannot influence the order of the rows and because data is inserted day by day (and not ticker by ticker).
In an index-organized table, the data for the same ticker ends up on a few disk pages, and the required disk pages can be easily found.
Setup of the table:
CREATE TABLE MARKET_DATA
(
TICKER VARCHAR2(20 BYTE) NOT NULL ENABLE,
P_DATE DATE NOT NULL ENABLE,
LAST_PRICE NUMBER,
VOLUME NUMBER,
CONSTRAINT MARKET_DATA_PK PRIMARY KEY (TICKER, P_DATE) ENABLE
)
ORGANIZATION INDEX;
Typical query:
SELECT TICKER, P_DATE, LAST_PRICE, VOLUME
FROM MARKET_DATA
WHERE TICKER = 'MSFT'
AND P_DATE BETWEEN SYSDATE - 1825 AND SYSDATE
ORDER BY P_DATE;
Think of index organized tables as indexes. We all know the point of an index: to improve access speeds to particular rows of data. This is a performance optimisation of trick of building compound indexes on sub-sets of columns which can be used to satisfy commonly-run queries. If an index can completely satisy the columns in a query's projection the optimizer knows it doesn't have to read from the table at all.
IOTs are just this approach taken to its logical confusion: buidl the index and throw away the underlying table.
There are two criteria for deciding whether to implement a table as an IOT:
It should consists of a primary key (one or more columns) and at most one other column. (okay, perhaps two other columns at a stretch, but it's an warning flag).
The only access route for the table is the primary key (or its leading columns).
That second point is the one which catches most people out, and is the main reason why the use cases for IOT are pretty rare. Oracle don't recommend building other indexes on an IOT, so that means any access which doesn't drive from the primary key will be a Full Table Scan. That might not matter if the table is small and we don't need to access it through some other path very often, but it's a killer for most application tables.
It is also likely that a candidate table will have a relatively small number of rows, and is likely to be fairly static. But this is not a hard'n'fast rule; certainly a huge, volatile table which matched the two criteria listed above could still be considered for implementations as an IOT.
So what makes a good candidate dor index organization? Reference data. Most code lookup tables are like something this:
code number not null primary key
description not null varchar2(30)
Almost always we're only interested in getting the description for a given code. So building it as an IOT will save space and reduce the access time to get the description.

Resources