Designing a relational system for large-scale performance

I've been having some difficulty scaling up the application and decided to ask a question here.
Consider a relational database (say, MySQL). Let's say it allows users to make posts, which are stored in the post table (fields: postid, posterid, data, timestamp). So, to retrieve all of your posts sorted by recency, you simply get all posts with posterid = you and order by timestamp. Simple enough.
This query will use the index on timestamp, since it has the highest cardinality, and rightly so. So, beyond traversing the index, it takes literally one row fetch from disk to complete this task. Awesome!
But let's say there have been 1 million more posts (in the system) by other users since you last posted. Then, in order to get your latest post, the database will hit the index on timestamp again, and it's not as if it knows how many posts have happened since then (or should we manually estimate and set a preferred key?). We have then wasted scanning a million and one rows just to fetch a single row.
Additionally, one of the use cases is a set of posts from multiple arbitrary users, so I cannot concatenate fields like userid_timestamp to create a sub-index.
Am I seeing this wrong? What must change fundamentally in the application to allow such an operation to occur at least somewhat efficiently?

Indexing
If you have a query: ... WHERE posterid = you ORDER BY timestamp [DESC], then you need a composite index on {posterid, timestamp}.
Finding all posts of a given user is done by a range scan on the index's leading edge (posterid).
Finding a user's oldest/newest post can be done in a single index seek, which is proportional to the B-Tree height, which is proportional to log(N), where N is the number of indexed rows.
To understand why, take a look at Anatomy of an SQL Index.
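For instance, with such a composite index in place, the "latest post" lookup from the question should reduce to a single index seek plus one row read (a sketch using the question's table and column names):
SELECT *
FROM post
WHERE posterid = 123
ORDER BY `timestamp` DESC
LIMIT 1;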
Clustering
The leaves of a "normal" B-Tree index hold "pointers" (physical addresses) to the indexed rows, while the rows themselves reside in a separate data structure called the "table heap". The heap can be eliminated by storing rows directly in the leaves of the B-Tree, which is called clustering. This has its pros and cons, but if you have one predominant kind of query, eliminating the table heap access through clustering is definitely something to consider.
In this particular case, the table could be created like this:
CREATE TABLE T (
    posterid INT,
    `timestamp` DATETIME,
    data VARCHAR(50),
    PRIMARY KEY (posterid, `timestamp`)
);
MySQL/InnoDB clusters all its tables and uses the primary key as the clustering key. We haven't used the surrogate key (postid), since secondary indexes in clustered tables can be expensive and we already have a natural key. If you really need the surrogate key, consider making it an alternate key and keeping the clustering established through the natural key, as sketched below.
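A minimal sketch of that variant, assuming MySQL/InnoDB (the UNIQUE constraint creates the alternate key as a secondary index, while clustering stays on the natural key):
CREATE TABLE T (
    postid INT NOT NULL,
    posterid INT NOT NULL,
    `timestamp` DATETIME NOT NULL,
    data VARCHAR(50),
    PRIMARY KEY (posterid, `timestamp`),
    UNIQUE KEY (postid)
);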

For queries like
where posterid = 5
order by timestamp
or
where posterid in (4, 578, 222299, ...etc...)
order by timestamp
make an index on (posterid, timestamp) and the database should pick it all by itself.
Edit - I just tried this with MySQL:
CREATE TABLE `posts` (
    `id` INT(11) NOT NULL,
    `ts` INT NOT NULL,
    `data` VARCHAR(100) NULL DEFAULT NULL,
    INDEX `id_ts` (`id`, `ts`),
    INDEX `id` (`id`),
    INDEX `ts` (`ts`),
    INDEX `ts_id` (`ts`, `id`)
)
ENGINE=InnoDB;
I filled it with a lot of data, and
explain
select * from posts where id = 5 order by ts
picks the id_ts index
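For the IN variant from above, one would expect the same index to serve the lookups, though MySQL may add a filesort step to merge the ordering across the different ids (a sketch against the same test table; I haven't verified the plan):
explain
select * from posts where id in (4, 578, 222299) order by ts;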

Assuming you use hash tables to implement your database - yes. Hash tables are not ordered, and you have no way other than iterating over all elements to find the maximum.
However, if you use an ordered data structure, such as a B+ tree (which is actually pretty well optimized for disks, and thus for databases), it is a different story.
You can store elements in your B+ tree ordered by user (primary comparator) and date (secondary comparator, descending). Once you have this structure, finding the first element matching the primary criterion (user id) can be achieved in O(log(n)) disk seeks.
I am not familiar with the implementations of specific databases, but AFAIK some of them do allow you to create an index based on a B+ tree - and by doing so, you can find the last post of a user more efficiently.
P.S.
To be exact, the concept of a "greatest" element, or an ordering, is not well defined in relational algebra: strict relational algebra has no max and no sort operator (though SQL has both). To get the max element of a table R with a single column a, one can take the Cartesian product of the table with itself and subtract every value that is smaller than some other value.
(Assuming set, not multiset, semantics):
MAX = R \ Project( Select( Rename(R, R1) x Rename(R, R2), R1.a < R2.a ), R1.a )
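The same set-difference idea reads naturally in SQL as a self-join: keep the values for which no larger value exists (a sketch; in practice you would of course just write SELECT MAX(a) FROM R):
SELECT R1.a
FROM R AS R1
WHERE NOT EXISTS (
    SELECT 1
    FROM R AS R2
    WHERE R2.a > R1.a
);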

Related

Oracle Auto Partition strategy on Integer column

I need some help on how to perform auto partitioning on an integer column, similar to how we do it on a date column, like PARTITION BY RANGE (DIM_DT_ID) INTERVAL (NUMTODSINTERVAL(1,'DAY')).
I have 90 million rows, performance is poor, and our SLA on the query is 2 seconds. I would like to partition. What is the best approach, and how do I enable auto partitioning on an integer column?
Our query will always filter by these columns like
select * from <tbname>
where ObjectID = 1346785
and patentnumber = 23456;
"i'm just making an example here, as i cant paste the original query for legality sake"
Fair enough, but the advice we give you will only be as good as the information you give us. So far, nothing you have posted suggests you need Partitioning.
The pasted query would perform well with a compound index, and would probably benefit from compression of the leading column:
create index your_table_lookup_index
on your_table(ObjectID, patentnumber) compress 1;
If that's a unique combination then make the index unique.
how do i enable auto partition on a Integer column
However, if you think you do have a genuine use case for Partitioning then we can use Interval Partitioning with integers as well as dates. This statement will create a table partitioned on objectid with a partition for every ten values.
create table your_table (
    objectid number,
    patentnumber number,
    created_date date
)
partition by range (objectid)
interval (10)
(
    partition p_00010 values less than (10)
);
On your posted figures that would be about 400 partitions with around 225000 rows per partition. Is that a good choice? Who can tell? You know your data and your use cases, we don't: perhaps a partition per objectid (i.e. with interval (1)) would be better.
You already have a table, so you need to split it into partitions. The standard way of doing this would be:
1. Create a new table with your partitioning strategy (like above), but with a single default partition ranged for values less than (MAXVALUE).
2. Use partition exchange to move the existing table data into the new structure.
3. Drop the old table and rename the new table to the old name; resolve foreign keys and other dependencies.
4. Iteratively split the default partition into the required ranges (a sketch of these steps follows below).
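A minimal sketch of those steps, using the hypothetical names from above (note that an interval-partitioned table cannot have a MAXVALUE partition, which is why the exchange table starts as plain range-partitioned):
create table your_table_new (
    objectid number,
    patentnumber number,
    created_date date
)
partition by range (objectid)
(
    partition p_default values less than (maxvalue)
);

alter table your_table_new exchange partition p_default with table your_table;
drop table your_table;
rename your_table_new to your_table;

alter table your_table split partition p_default at (10)
    into (partition p_00010, partition p_default);
-- repeat the split for each required range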
This is a fairly time-consuming process. You have tagged your question [oracle12c]; if you're using Oracle 12c R2 you should definitely look at its online conversion mechanism (ALTER TABLE ... MODIFY), which is a single command.
Remember that Partitioning for performance is a tricky game. While it can improve queries which return a large number of rows aligned with the partition key, it can make no difference to other queries, or even impair their performance. In particular, any query which does not include the partition key (objectid in your case) will likely perform worse after partitioning the table.
Final aside: as you know but for the benefit of future Seekers, Partitioning is a chargeable extra to the Enterprise Edition license. We're not allowed to use it unless we've paid for it.

Bad performance when writing log data to Cassandra with timeuuid as a column name

Following the pointers in an eBay tech blog and a DataStax developers blog, I model some event log data in Cassandra 1.2. As a partition key, I use “yymmddhh|bucket”, where bucket is any number between 0 and the number of nodes in the cluster.
The Data model
cqlsh:Log> CREATE TABLE transactions (
    yymmddhh varchar,
    bucket int,
    rId int,
    created timeuuid,
    data map<text, text>,
    PRIMARY KEY ((yymmddhh, bucket), created)
);
(rId identifies the resource that fired the event.)
(data is a map of key-value pairs derived from JSON; the keys change, but not much.)
I assume that this translates into a composite primary/row key with X buckets per hour.
My column names are then timeuuids. Querying this data model works as expected (I can query time ranges).
The problem is the performance: the time to insert a new row increases continuously.
So I am doing something wrong, but can't pinpoint the problem.
When I use the timeuuid as a part of the row key, the performance remains stable on a high level, but this would prevent me from querying it (a query without the row key of course throws an error message about "filtering").
Any help? Thanks!
UPDATE
Switching from the map data type to predefined column names alleviates the problem. Insert times now seem to remain below about 0.005 s per insert.
The core question remains:
How is my usage of the "map" data type inefficient? And what would be an efficient way to do thousands of inserts with only slight variation in the keys?
The keys I put into the map mostly remain the same. I understood the DataStax documentation (can't post the link due to reputation limitations, sorry, but it's easy to find) to say that each key creates an additional column -- or does it create one new column per "map"?? That would be... hard for me to believe.
I suggest you model your rows a little differently. The collections aren't very good to use in cases where you might end up with too many elements in them. The reason is a limitation in the Cassandra binary protocol which uses two bytes to represent the number of elements in a collection. This means that if your collection has more than 2^16 elements in it the size field will overflow and even though the server sends all of the elements back to the client, the client only sees the N % 2^16 first elements (so if you have 2^16 + 3 elements it will look to the client as if there are only 3 elements).
If there is no risk of getting that many elements into your collections, you can ignore this advice. I would not think that using collections gives you worse performance, I'm not really sure how that would happen.
CQL3 collections are basically just a hack on top of the storage model (and I don't mean hack in any negative sense); you can make a MAP-like row that is not constrained by the above limitation yourself:
CREATE TABLE transactions (
    yymmddhh VARCHAR,
    bucket INT,
    created TIMEUUID,
    rId INT,
    key VARCHAR,
    value VARCHAR,
    PRIMARY KEY ((yymmddhh, bucket), created, rId, key)
);
(Notice that I moved rId and the map key into the primary key, I don't know what rId is, but I assume that this would be correct)
This has two drawbacks compared to using a MAP: it requires you to reassemble the map when you query the data (you get back a row per map entry), and it uses a little more space since C* will insert a few extra columns. But the upside is that there is no problem with collections getting too big.
In the end it depends a lot on how you want to query your data. Don't optimize for insertions, optimize for reads. For example: if you don't need to read back the whole map every time, but usually just read one or two keys from it, put the key in the partition/row key instead and have a separate partition/row per key (this assumes that the set of keys will be fixed so you know what to query for, so as I said: it depends a lot on how you want to query your data).
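For example, if reads usually fetch a single map key, a per-key model along those lines might look like this (a sketch, reusing the hypothetical columns from above):
CREATE TABLE transactions_by_key (
    key VARCHAR,
    yymmddhh VARCHAR,
    bucket INT,
    created TIMEUUID,
    rId INT,
    value VARCHAR,
    PRIMARY KEY ((key, yymmddhh, bucket), created, rId)
);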
You also mentioned in a comment that the performance improved when you increased the number of buckets from three (0-2) to 300 (0-299). The reason for this is that you spread the load much more evenly throughout the cluster. When you have a partition/row key that is based on time, like your yymmddhh, there will always be a hot partition where all writes go (it moves throughout the day, but at any given moment it will hit only one node). You correctly added a smoothing factor with the bucket column/cell, but with only three values the likelihood of at least two ending up on the same physical node is too high. With three hundred you will have a much better spread.
Use yymmddhh as the row key and bucket+timeuuid as the column name, where each bucket holds 20 (or some other fixed number of) records; the buckets can be managed using a counter column family.

Sybase - Performance considerations while indexing existing table

I have a table in Sybase which has around 1 million rows. This table currently does not have any index created, and I would like to create one now. My questions are:
What precautions should I take before creating an index?
Does this process require more tablespace to be allocated?
Any other performance considerations I should take care of?
Cheers
Ranjith
From the manual:
When to index
Use the following general guidelines:
If you plan to do manual insertions into the IDENTITY column, create a unique index to ensure that the inserts do not assign a value that has already been used.
A column that is often accessed in sorted order, that is, specified in the order by clause, probably should be indexed so that Adaptive Server can take advantage of the indexed order.
Columns that are regularly used in joins should always be indexed, since the system can perform the join faster if the columns are in sorted order.
The column that stores the primary key of the table often has a clustered index, especially if it is frequently joined to columns in other tables. Remember, there can be only one clustered index per table.
A column that is often searched for ranges of values might be a good choice for a clustered index. Once the row with the first value in the range is found, rows with subsequent values are guaranteed to be physically adjacent. A clustered index does not offer as much of an advantage for searches on single values.
When not to index
In some cases, indexes are not useful:
Columns that are seldom or never referenced in queries do not benefit from indexes, since the system seldom has to search for rows on the basis of values in these columns.
Columns that can have only two or three values, for example, "male" and "female" or "yes" and "no", get no real advantage from indexes.
Try
sp_spaceused tablename, 1
Here is a link to the documentation.
Yes - updating statistics on the indexes after creating them, as sketched below.
Here is a link to the documentation.
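For example (a sketch; in Sybase ASE, update statistics refreshes the distribution statistics the optimizer relies on):
update statistics tablename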

Maintaining a sorted list that is bigger than memory

I have a list of tuples.
[
"Bob": 3,
"Alice": 2,
"Jane": 1,
]
When incrementing the counts
"Alice" += 2
the order should be maintained:
[
"Alice": 4,
"Bob": 3,
"Jane": 1,
]
When all is in memory there are rather simple ways (some more, some less efficient) to implement this (using an index, insertion sort, etc.). The question, though, is: what is the most promising approach when the list does not fit into memory?
Bonus question: what if not even the index fits into memory?
How would you approach this?
B+ trees order a number of items using a key. In this case, the key is the count, and the item is the person's name. The entire B+tree doesn't need to fit into memory - just the current node being searched. You can set the maximum size of the nodes (and indirectly the depth of the tree) so that a node will fit into memory. (In practice nodes are usually far smaller than memory capacity.)
The data items are stored at the leaves of the tree, in so-called blocks. You can either store the items inline in the index, or store pointers to external storage. If the data is regularly sized, this can make for efficient retrieval from files. In the question example, the data items could be single names, but it would be more efficient to store blocks of names, all names in a block having the same count. The names within each block could also be sorted. (The names in the blocks themselves could be organized as a B-tree.)
If the number of names becomes large enough that the B+tree blocks are becoming excessively large, the key can be made into a composite key, e.g. (count, first-letter). When searching the tree, only the count needs to be compared to find all names with that count. When inserting, or searching for a specific name with a given count, the full key can be compared to include filtering by name prefix.
Alternatively, instead of a composite key, the data items can point to offsets/blocks in an external file that contains the blocks of names, which will keep the B+tree itself small.
If the leaf blocks of the B+tree are linked together, range queries can be efficiently implemented by searching for the start of the range and then following block pointers to the next block until the end of the range is reached. This would allow you to efficiently implement "find all names with a count between 10 and 20".
As the other answers have noted, an RDBMS is the pre-packaged way of storing lists that don't fit into memory, but I hope this gives an insight into the structures used to solve the problem.
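In relational terms, that linked-leaf range scan is exactly what a composite B-tree index provides; a minimal sketch in SQL, with hypothetical names:
CREATE TABLE `name_counts` (
    `name` VARCHAR(255) NOT NULL,
    `count` INT NOT NULL,
    PRIMARY KEY (`count`, `name`)
);

-- "find all names with a count between 10 and 20"
SELECT `name`, `count`
FROM `name_counts`
WHERE `count` BETWEEN 10 AND 20;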
A relational database such as MySQL is specifically designed for storing large amounts of data the sum of which does not fit into memory, querying against this large amount of data, and even updating it in place.
For example:
CREATE TABLE `people` (
`name` VARCHAR(255),
`count` INT
);
INSERT INTO `people` VALUES
('Bob', 3),
('Alice', 2),
('Jane', 1);
UPDATE `people` SET `count` = `count` + 2 WHERE `name` = 'Alice';
After the UPDATE statement, the query SELECT * FROM `people` ORDER BY `count` DESC; will show:
+-------+-------+
| name  | count |
+-------+-------+
| Alice |     4 |
| Bob   |     3 |
| Jane  |     1 |
+-------+-------+
You can preserve the original order of people in your table by adding an auto-incrementing primary key:
CREATE TABLE `people` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` VARCHAR(255),
`count` INT,
PRIMARY KEY(`id`)
);
INSERT INTO `people` VALUES
(DEFAULT, 'Bob', 3),
(DEFAULT, 'Alice', 2),
(DEFAULT, 'Jane', 1);
RDBMS? Even flat-file versions like SQLite. Otherwise, a combination utilizing lazy loading: only keep X records in memory - the top Y records and the Z most recent ones whose counts were updated. Otherwise, a table of key and count columns where you run UPDATEs changing the values. The ordered list can be retrieved with a simple SELECT ... ORDER BY.
Read about B-trees and B+-trees. With these, the index can always be made small enough to fit into memory.
An interesting approach quite unlike B-trees is the Judy array (also called a Judy tree).
What you seem to be looking for are out-of-core algorithms for container classes, specifically an out-of-core list container class. Check out the stxxl library for some great examples of out-of-core algorithms and processing.
You may also want to look at this related question
As far as "implementation details tackling this 'by hand'", you could read about how database systems do this by searching for the original papers on database design or finding graduate course notes on database architecture.
I did some searching and found a survey article by G. Graefe titled "Query evaluation techniques for large databases". It somewhat exhaustively covers every aspect of querying large databases, but the entire section 4 addresses how "query evaluation systems ... access base data stored in the database". Also, Graefe's survey was linked to by the course page for CPS 216: Advanced Databases Systems at Duke, Fall 2001. Week 5 was on Physical Data Organization, which says that most commercial DBMSs organize data on disk using blocks in the N-ary Storage Model (NSM): records are stored from the beginning of each block and a "directory" exists at the end.
See also:
Spring 2004 CPS 216 lecture notes
MIT OCW 6.830 Database Systems
Of course I know I could use a database. This question was more about the implementation details tackling this "by hand"
So basically, you are asking "How does a database do this?" To which the answer is, it uses a tree (for both the data and the index), and only stores part of the tree in memory at any one time.
As has already been mentioned, B-Trees are especially useful for this: since hard-drives always read a fixed amount at a time (the "sector size"), you can make each node the size of a sector to maximize efficiency.
You do not specify that you need to add or remove any elements from the list, just keep it sorted.
If so, a straightforward flat-file approach - typically using mmap for convenience - will work and be faster than a more generic database.
You can use a bsearch to locate the item, or maintain a set of the counts of slots with each value.
As you access an item, the part of the file it is in (think in terms of memory 'pages') gets read into RAM automatically by the OS, and the slot and its adjacent slots even get copied into the L1 cache line.
You can do an immediate comparison on its adjacent slots to see if the increment or decrement causes the item to be out of order; if it is, you can use a linear iteration (perhaps augmented with a bsearch) to locate the first/last item with the appropriate count, and then swap them.
Managing files is what the OS is built to do.

How to choose and optimize oracle indexes? [closed]

I would like to know if there are general rules for creating an index or not.
How do I choose which fields I should include in this index or when not to include them?
I know it always depends on the environment and the amount of data, but I was wondering if we could make some globally accepted rules about making indexes in Oracle.
The Oracle documentation has an excellent set of considerations for indexing choices: http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/data_acc.htm#PFGRF004
Update for 19c: https://docs.oracle.com/en/database/oracle/oracle-database/19/tgdba/designing-and-developing-for-performance.html#GUID-99A7FD1B-CEFD-4E91-9486-2CBBFC2B7A1D
Quoting:
Consider indexing keys that are used frequently in WHERE clauses.
Consider indexing keys that are used frequently to join tables in SQL statements. For more information on optimizing joins, see the section "Using Hash Clusters for Performance".
Choose index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value.
Note: Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints.
Indexing low selectivity columns can be helpful if the data distribution is skewed so that one or two values occur much less often than other values.
Do not use standard B-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless the index is modified frequently, as in a high concurrency OLTP application.
Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.
Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function, other than MIN or MAX, or an operator with an indexed key does not make available the access path that uses the index except with function-based indexes.
Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.
When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.
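As a small illustration of the function-based index guideline above, a sketch with hypothetical names:
CREATE INDEX emp_upper_name_i ON employees (UPPER(last_name));

-- this predicate can now use the index above
SELECT * FROM employees WHERE UPPER(last_name) = 'SMITH';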
There are some things you should always index:
Primary Keys - these are given an index automatically (unless you specify a suitable existing index for Oracle to use)
Unique Keys - these are given an index automatically (ditto)
Foreign Keys - these are not automatically indexed, but you should add one to avoid performance issues when the constraints are checked
After that, look for other columns that are frequently used to filter queries: a typical example is people's surnames.
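For example, a plain B-tree index on such a filter column (hypothetical names):
CREATE INDEX people_surname_i ON people (surname);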
From the 10g Oracle Database Application Developers Guide - Fundamentals, Chapter 5:
In general, you should create an index on a column in any of the following situations:
The column is queried frequently.
A referential integrity constraint exists on the column.
A UNIQUE key integrity constraint exists on the column.
Use the following guidelines for determining when to create an index:
Create an index if you frequently want to retrieve less than about 15% of the rows in a large table. This threshold percentage varies greatly, however, according to the relative speed of a table scan and how clustered the row data is about the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage.
Index columns that are used for joins to improve join performance.
Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key; see Chapter 6, "Maintaining Data Integrity in Application Development" for more information.
Small tables do not require indexes; if a query is taking too long, then the table might have grown from small to large.
Some columns are strong candidates for indexing. Columns with one or more of the following characteristics are good candidates for indexing:
Values are unique in the column, or there are few duplicates.
There is a wide range of values (good for regular indexes).
There is a small range of values (good for bitmap indexes).
The column contains many nulls, but queries often select all rows having a value. In this case, a comparison that matches all the non-null values, such as:
WHERE COL_X >= -9.99 *power(10,125)
is preferable to
WHERE COL_X IS NOT NULL
This is because the first uses an index on COL_X (assuming that COL_X is a numeric column).
Columns with the following characteristics are less suitable for indexing:
There are many nulls in the column and you do not search on the non-null values.
Wow, that's just such a huge topic, it's hard to answer in this format. I strongly recommend this book:
Relational Database Index Design and the Optimizers
by Tapio Lahdenmaki
You don't just use indexes to make table access faster, sometimes you make indexes to avoid table access altogether. Something not mentioned yet but vital.
There's a whole science to this if you really want to make your database perform maximally.
Ah, one optimization specific to Oracle is building reverse key indexes. If you have a PK index on a monotonically increasing value, like a sequence, and you have highly concurrent inserts and don't plan to range scan that column, then make it a reverse key index.
See how specific these optimizations can be?
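A minimal sketch of that, with hypothetical names (for an actual primary key you would attach the index to the constraint via USING INDEX):
CREATE INDEX orders_id_rix ON orders (order_id) REVERSE;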
Look into Database Normalization - you'll find a lot of good, industry standard rules about what keys should exist, how databases should be related, and hints on indexes.
-Adam
Usually one puts the ID columns up front, and those usually identify the rows uniquely. A combination of columns can also do the same thing. As an example using cars: tags, or license plates, are unique and qualify for an index; the tags column can qualify for the primary key. The owner's name can qualify for an index if you are going to search on name. The make of the car really shouldn't get an index at the beginning, as it's not going to vary much. Indexes don't help if the data in the column doesn't vary much.
Take a look at the SQL - what are the WHERE clauses looking at? Those columns may need an index.
Measure. What is the issue - pages or queries taking too long? What is being used by those queries? Create an index on those columns.
Caveats: indexes cost time on updates and take space.
And sometimes full table scans are quicker than an index: small tables can be scanned faster than getting the index and then hitting the table. Look at your joins.
