Are DynamoDB global secondary indexes sharded based on their own partition key? - performance

I'm a little confused about what I'm reading in the DynamoDB help when it comes to their recommendations for partition key design.
On this first page discussing distributing workload they explain that it's important to have load distributed across distinct partition keys, since these are used for physical sharding. Makes perfect sense.
But then, when explaining global secondary indexes, they proceed to do exactly the opposite in all the examples provided:
In this high-score example, they create a partition key on an attribute that has a single value! Doesn't this mean that all requests for the high scores (which are "frequently queried" per the problem definition) will hit the same shard?
In the GSI overloading example, they suggest creating a GSI that uses the table's sort key as its partition key, then performing searches e.g. by Employee_Name - but Employee_Name is a partition key value in the GSI, so again, wouldn't all these requests hit the same shard?
Aren't these examples going to create hot partitions on the GSI and thus suffer from scaling issues? Or am I misunderstanding something?

I think you're right and these examples are indeed wrong for exactly the reasons you cited.
I think in the GSI overloading example, they switched the partition and sort key around - I opened an issue about that many months ago explaining why I think that - see https://github.com/awsdocs/amazon-dynamodb-developer-guide/issues/202 - but so far it wasn't fixed (or nobody came back to explain to me why the example was right).

Related

Oracle table and index partitioning - risk and disadvantages

I have large unpartitioned tables in the database (100GB+), and to improve performance I am thinking about partitioning them, or maybe just their indexes. Data comes in on a regular basis and is selected by date, so I think range partitioning by month of creation date would be a good option.
I am reading about Oracle table and index partitioning, and it looks quite promising.
But I have two questions for which I cannot find answers (I think my Google skills are going down).
First one is:
What are the risks and disadvantages of creating partitioned tables and indexes in Oracle, in particular on such large and live tables? Is there something that I should know about?
Second:
How do I partition an existing, unpartitioned table or index?
Besides the outage (see below) needed to partition your data, the main risk I see is that if you decide to partition your table and indexes with local indexes, performance will not be great for queries that do not rely on the partition key (date). But you can use global indexes in that case and get back to similar performance.
The simplest way to create a partitioned table from an unpartitioned one, by far, is to use create table as select with a new name and all the partition storage details, drop the unpartitioned table, and rename the new table to the old name. Obviously, this requires careful preparation, and an outage that can last a few minutes :)

Efficient creation time descending lookup in Riak

I'm learning to use Riak, the NoSQL engine. Given that I have a user "timeline" with posts, and that the number of posts may range from millions to billions, how can I take the last N posts from the Riak bucket? I mean the most recently created ones.
I read that when using a Secondary Index, Riak will return posts ordered by key. So I decided to use a UUID1 for post keys and to have a Secondary Index on the post author, so that I can take all posts from that author using its key.
However, the posts are sorted ASCENDING! I also want to use the max_results parameter like the SQL LIMIT.
This query, however, returns the FIRST N posts of that user, not the last. Given that I have already seen some StackOverflow posts, and that the proposed solution, MapReduce, is not efficient for big buckets, how would you model the data or write the query?
Thanks
When coming from a SQL environment it is easy to treat a bucket as a table and store small individual records there, often relying on secondary indexes to get the data out. As Riak is a key-value store that uses consistent hashing, this is however often not the most efficient or scalable approach.
A lookup based on key in Riak allows the partitions holding the data to be directly identified, and the coordinating node can directly query these partitions. When querying a secondary index, Riak does not know on which partitions data that may match the index will reside. It will therefore need to send the query to a large number of partitions in order to ensure that all matching objects can be found. This is known as a 'coverage query' and means that, assuming n_val of 3 is used for the bucket, at least 1/3 of all partitions need to be queried. This generally leads to higher load on the cluster and does not scale as well as direct key lookups. Latencies also tend to be higher.
When using Riak it is therefore often recommended that you structure your data so that you can use direct key lookups as much as possible, e.g. through de-normalization.
If your messages/posts can be grouped some way, e.g. by user or conversation, it may make sense to store them in a single object representing this grouping instead of as separate objects.
If we assume that your posts can consist of either text or images and are linked to a conversation thread, you could create an object representing the conversation thread. This would contain information about the conversation as well as a list of posts. This list of posts can e.g. contain the id of the poster, a timestamp and the key of the record containing the post. If the post is a reasonably short text message it may even contain the entire post, reducing the number of records that will need to be fetched.
As posts come in to this conversation, the record is updated and the list of posts gets longer. It may be wise to set allow_mult to true in order to enable siblings, as this will allow you to handle concurrent writes. This approach allows you to always get the conversation as well as the latest posts through a single direct key lookup.
Riak works best when the size of objects is kept below a couple of MB. You will therefore need to move the oldest posts off to a separate object at some point to keep the size in check. If you keep a list of these related objects in the main conversation object, possibly together with some information about the time interval they cover, you can easily access these through direct key lookup as well if you should need to scroll back over older posts.
As the most common query usually is for the most recent entries, this can always be fulfilled through the main conversation object.
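As a rough illustration of the conversation-object approach described above, here is a minimal Python sketch. It uses a plain dict as a stand-in for a Riak bucket and does not call any Riak client; the names `add_post`, `latest_posts`, `MAX_POSTS`, `ARCHIVE_BATCH` and the key layout are assumptions made up for this example, and sibling resolution for concurrent writes is deliberately left out.

```python
# Hypothetical limits, kept tiny for illustration; in practice they would be
# chosen so the main object stays well below a couple of MB.
MAX_POSTS = 4
ARCHIVE_BATCH = 2


def add_post(store, conv_key, post):
    """Append a post to the conversation object, archiving the oldest posts if needed."""
    conv = store.setdefault(conv_key, {"posts": [], "archives": []})
    conv["posts"].append(post)
    if len(conv["posts"]) > MAX_POSTS:
        # Move the oldest posts into a separate object with its own key.
        overflow = conv["posts"][:ARCHIVE_BATCH]
        conv["posts"] = conv["posts"][ARCHIVE_BATCH:]
        archive_key = f"{conv_key}:archive:{len(conv['archives'])}"
        store[archive_key] = {"posts": overflow}
        # Remember the archive key and the time interval it covers.
        conv["archives"].append(
            {"key": archive_key, "from": overflow[0]["ts"], "to": overflow[-1]["ts"]}
        )


def latest_posts(store, conv_key, n):
    """Return the n most recent posts, newest first, with one key lookup."""
    return store[conv_key]["posts"][-n:][::-1]


store = {}  # stand-in for a bucket: key -> object
for i in range(1, 8):
    add_post(store, "conv1", {"ts": i, "author": "alice", "text": f"post {i}"})
print(latest_posts(store, "conv1", 2))
```

The most recent posts always come from the main object, while older ones can be reached through the archive keys listed in it, again by direct key lookup.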
I would also like to point out that we do have a very active mailing list where these kinds of issues are discussed quite frequently.
I know it's probably too late to help you, but I found this post while wondering about the same thing. The workaround I have come up with, and have been using to good effect, is to create two secondary indexes: one with the real timestamp, and another with (MAX_DATE - timestamp). Lookups on the first index return ascending results, and lookups on the second index return descending results (once you do the math to turn the value back into a real date). You can find the max date value in the JavaScript specification, as reported on MDN: it is 8640000000000000. I can't speak to how performant it is under really heavy load, but I can tell you that for my purposes it has been blazingly fast and I'm very satisfied. I just came here hoping to find a less hacky way to do it.
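A minimal Python sketch of the arithmetic behind this workaround, assuming millisecond timestamps. It only computes the two index values and does not call any Riak client; the index names `created_asc_int` and `created_desc_int` and the helper functions are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Largest millisecond timestamp a JavaScript Date can hold (value quoted above).
MAX_DATE_MS = 8_640_000_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(dt):
    """Whole milliseconds since the Unix epoch for a timezone-aware datetime."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def index_values(created_at):
    """Values for the two (hypothetically named) integer secondary indexes."""
    ts = to_millis(created_at)
    return {
        "created_asc_int": ts,                 # ascending order = oldest first
        "created_desc_int": MAX_DATE_MS - ts,  # ascending order = newest first
    }


def from_desc(value):
    """Turn an inverted index value back into the real creation time."""
    return EPOCH + timedelta(milliseconds=MAX_DATE_MS - value)


older = datetime(2013, 1, 1, tzinfo=timezone.utc)
newer = datetime(2013, 6, 1, 12, 30, tzinfo=timezone.utc)
# The newer post gets the smaller inverted value, so it sorts first.
print(index_values(newer)["created_desc_int"] < index_values(older)["created_desc_int"])
```

Because the inverted value shrinks as the timestamp grows, an ascending range query on the second index with a result limit yields the most recent posts first.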

How does oracle manage a hash partition

I understand the concept of range partitioning. If I have a date column and I partition on that column by month, then when my query has a where clause filtering for a single month, I can hit one particular partition and get my data without scanning the full table.
In the Oracle docs I read that if a logical partitioning like 'month' is not available (e.g. you partition on a column called customer id), then you should use hash partitioning. So how does this work? Will Oracle randomly divide the data, assign it to different partitions, and assign a hash code to each partition?
But in this situation, when new data comes in, how does Oracle know in which partition to put the new data? And when I query data, it seems there is no way to avoid hitting multiple partitions?
"how does oracle know in which partition to put the new data?"
From the documentation
Oracle Database uses a linear hashing algorithm and to prevent data
from clustering within specific partitions, you should define the
number of partitions by a power of two (for example, 2, 4, 8).
As for your other question ...
"when i query data, it seems there is no way to avoid hitting multiple
partitions?"
If you're searching for a single Customer ID then no. Oracle's hashing algorithm is consistent, so records with the same partition key end up in the same partition (obviously). But if you are searching for, say, all the new customers from the last month then yes. Oracle's hashing algorithm will strive to distribute records evenly so the latest records will be spread across the whole table.
So the real question is, why do we choose to partition a table? Performance is often the least compelling reason to partition. Better reasons include
Availability: each partition can reside on a different tablespace. Hence a problem with a tablespace will take out a slice of the table's data instead of the whole thing.
Management: partitioning provides a mechanism for splitting whole-table jobs into clear batches. Partition exchange can make it easier to bulk load data.
As for performance, physical co-location of records can speed up some queries - those which search for records by a defined range of keys. However, any queries which don't match the grain of the partitioning won't perform faster (and may even perform slower) than on a non-partitioned table.
Hash partitioning is unlikely to provide performance benefits, precisely because it shuffles the keys across the whole table. It will provide the availability and manageability benefits of partitioning (but is obviously not particularly amenable to partition exchange).
A hash is not random, it divides the data in a repeatable (but perhaps difficult-to-predict) fashion so that the same ID will always map to the same partition.
Oracle uses a hash algorithm that should usually spread the data evenly between partitions.
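To make the "repeatable, not random" point concrete, here is a small Python sketch. It is not Oracle's hashing algorithm; it is a generic hash-then-modulo scheme using SHA-256, and the function name `partition_for` is made up for this illustration.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a key to a partition number in a repeatable way (illustrative only)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 8  # a power of two, as the quoted documentation suggests

# The same customer id always lands in the same partition, so an insert and a
# later lookup by that id both go to exactly one partition.
print(partition_for(415, NUM_PARTITIONS) == partition_for(415, NUM_PARTITIONS))

# A range of consecutive ids (e.g. the newest customers) is spread over the
# partitions, so a query for that range has to visit many of them.
touched = {partition_for(cid, NUM_PARTITIONS) for cid in range(1000, 2000)}
print(len(touched))
```

This mirrors the two cases in the answers above: a lookup by a single key can be pruned to one partition, while a query that is not an equality on the partition key cannot.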

Asking for opinions : One sequence for all tables

Here's another one I've been thinking about lately.
We have concluded in earlier discussions: 'natural primary keys are bad, artificial primary keys are good.'
Working with Hibernate earlier, I saw that Hibernate by default creates one sequence for all tables. At first I was puzzled by this: why would you do this? But later I saw the advantage that it makes linking parents and children foolproof. Because no two tables share a primary key value, accidentally linking a parent with a table that is not its child gives no results.
Does anyone see any downsides to this approach? I only see one: you cannot have more than 999999999999999999999999999 records in your database.
There could be performance issues with all code getting values from a single sequence - see this Ask Tom thread.
Depending on how sequences are implemented in the database, always hitting the same sequence can be better or worse. When only a few or only one thread request new values, there will be no locking issues. But a bad implementation could cause congestion.
Another problem is rolling back transactions: Sequences don't get rolled back (because someone else might have requested a higher value already), so you can have large gaps which will eat your number space much more quickly than you might expect. OTOH, it will take some time to eat 2 or 4 billion IDs (if you "only" use 32 bit (signed) ints), so it's rarely an issue in practice.
Lastly, you can't easily reset the sequence if you have to. But if you need to have a restarting sequence (say, number of records since midnight), you can tell Hibernate to create/use a second sequence.
A major advantage is that you can uniquely identify objects anywhere in the DB just by the ID. That means you can severely cut down the log information you write in the production system and still find something if you only have the ID.
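The point above about sequences not being rolled back can be illustrated with a tiny Python simulation. This is a toy model, not a database: the `Sequence` class and `insert_rows` function are invented for the example, and a "rollback" simply means the fetched value is thrown away.

```python
import itertools


class Sequence:
    """Toy stand-in for a database sequence: values are never handed back."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def nextval(self):
        return next(self._counter)


def insert_rows(seq, outcomes):
    """Simulate transactions; each outcome is True (commit) or False (rollback).

    Returns the ids that actually end up in the table.
    """
    committed = []
    for commits in outcomes:
        new_id = seq.nextval()  # the value is consumed even if we roll back
        if commits:
            committed.append(new_id)
    return committed


seq = Sequence()
ids = insert_rows(seq, [True, False, False, True])
print(ids)            # ids 2 and 3 were consumed by rolled-back transactions
print(seq.nextval())  # the sequence simply carries on after the gap
```

With a single shared sequence, such gaps from every table come out of the same number space, which is why it can be used up faster than the row counts alone suggest.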
I prefer having one sequence per table. This comes from one general observation: Some tables ("master tables") have a relatively small row count and have to be kept "forever". For example, the customer table in an ERP.
In other tables ("transaction tables"), many rows are generated perpetually, but after some time, those rows can be archived (or simply deleted). The most extreme example is a tracing table used for debugging purposes; it might grow by hundreds of rows per second, but each row is obsolete after a few days.
Small IDs in the master tables make it easier when working directly on the database, e.g. for debugging purposes.
select * from orders where customerid=415
vs
select * from orders where customerid=89461836571
But this is only a minor issue. The bigger issue is cycling. If you use one sequence for all tables, you simply cannot let it restart. With one sequence per table, you can restart the sequences for the transaction tables when you have archived or deleted the old data. Master tables hardly ever have that problem, since they grow much slower.
I see little value in having only one sequence for all tables. The arguments told so far do not convince me.
There are a couple of disadvantages to using a single sequence:
Reduced concurrency. Handing out the next sequence value involves synchronisation. In practice, I do not think this is likely to be a big problem.
Oracle has special code when maintaining btree indexes to detect monotonically increasing values and balance the tree appropriately.
The CBO might have a better time estimating range queries on the index (if you ever did this) if most values were filled in
An advantage might be that you can determine the order of inserts amongst different tables.
Certainly there are pros and cons to the one-sequence versus one-sequence-per-table approach. Personally I find the ability to assign a truly unique identifier to a row, making each id column a uuid, to be enough of a benefit to outweigh any disadvantages. As Aaron D. succinctly writes:
you can uniquely identify objects anywhere in the DB just by the ID
And, for most applications, due to the way Hibernate3 batches INSERT statements, this will not be a performance bottleneck unless massive numbers of records are vying for the same db resource (SELECT hibernate_sequence.nextval FROM dual).
Also, this sequence mapping is not supported in the latest release (1.2) of Grails. Though it was supported in Grails 1.1 (!). It now requires subclassing one of the Hibernate dialect classes as a workaround.
For those using Grails/GORM, have a look at this JIRA entry:
Oracle Sequence mappings ignored

Does having several indices all starting with the same columns negatively affect Sybase optimizer speed or accuracy?

We have a table with, say, 5 indices (one clustered).
Question: will it somehow negatively affect optimizer performance - either speed or accuracy of index picks - if all 5 indices start with the same exact field? (all other things being equal).
It was suggested by someone at the company that it may have detrimental effect on performance, and thus one of the indices needs to have the first two fields switched.
I would prefer to avoid change if it is not necessary, since they didn't back up their assertion with any facts/reasoning, but the guy is senior and smart enough that I'm inclined to seriously consider what he suggests.
NOTE1: The basic answer "tailor the index to the where clauses and overall queries" is not going to help me - the index that would be changed is a covered index for the only query using it and thus the order of the fields in it would not affect the IO amount. I have asked a separate SO question just to confirm that assertion.
NOTE2: That field is a date when the records are inserted, and the table is pretty big, if this matters. It has data for ~100 days, about equal # of rows per date, and the first index is a clustered index starting with that date field.
The optimizer has to think more about which if any of the indexes to use if there are five. That cost is usually not too bad, but it depends on the queries you're asking of it. In principle, once the query is optimized, the time taken to execute it should be about the same. If you are preparing SELECT statements for multiple uses, that won't matter much. If every query is prepared afresh and never reused, then the overhead may become a drag on the system performance - particularly if it turns out that it really doesn't matter which of the indexes is actually used for most queries (a moderately strong danger when five indexes all share the same leading columns).
There is also the maintenance cost when the data changes - updating five indexes takes noticeably longer than updating just one, plus you are using roughly five times as much disk storage for five indexes as for one.
I do not wish to speak for your senior colleague but I believe you have misinterpreted what he said, or he has not expressed himself explicitly enough for you to understand.
One of the things that stands out about poorly designed, and therefore poorly performing, tables is that they have many indices on them, and the leading columns of the indices are all the same. Every single time.
So it is pointless debating (the debate is too isolated) whether there is a server cost for indices which all have the same leading columns; the problem is the poorly designed table which exposes itself in myriad ways. That is a massive server cost on every access. I suspect that that is where your esteemed colleague was coming from.
A monotonic column is a very poor choice for an index (understood, you need at least one). But when you use that monotonic column to force uniqueness in some other index, which would otherwise be irrelevant (due to low cardinality, such as SexCode), that is another red flag to me. You've merely forced an irrelevant index to be slightly relevant; the queries, except for the single covered query, perform poorly on anything beyond the simplest select via primary key.
There is no such thing as a "covered index", but I understand what you mean, you have added an index so that a certain query will execute as a covered query. Another flag.
I am with Mitch, but I am not sure you get his drift.
Last, responding to your question in isolation, having five indices with the leading columns all the same would not cause a "performance problem" beyond that which you already have due to the poor table design, but it will cause angst and unnecessary manual labour for the developers chasing down weird behaviour, such as "how come the optimiser used index_1 for my query but today it is using index_4?".
Your language consistently (and particularly in the comments) displays a manner of dealing with issues in isolation. The concept of a server and a database, is that it is a shared central resource, the very opposite of isolation. A problem that is "solved" in isolation will usually result in negative performance impact for everyone outside that isolated space.
If you really want the problem dealt with, fully, post the CREATE TABLE statement.
I doubt it would have any major impact on SELECT performance.
BUT it probably means you could reorganise those indexes (based on a representative query workload) to serve queries more efficiently.
I'm not familiar with the recent versions of Sybase, but in general, with all SQL servers, the main (and almost the only) performance impact indexes have is on INSERT, DELETE and UPDATE queries. Basically, each change to the database requires the data table per se (or the clustered index) to be updated, as well as all the indexes.
With regards to SELECT queries, having "too many" indexes may have a minor performance impact for example by introducing competing hard disk pages for cache. But I doubt this would be a significant issue in most cases.
The fact that the first column in all these indexes is the date, and assuming a generally monotonic progression of the date value, is a positive thing (with regards to CRUD operations), for it will keep the need for splitting/balancing the index tables to a minimum (since most inserts are at the end of the indexes).
Also, this table appears to be small enough ("big" is a relative word ;-) ) that some experimentation with it, to assess performance issues in a more systematic fashion, could probably be done relatively safely and easily without interfering much with production. (Unless the 10k or so records are very wide, or the queries-per-second rate is high, etc.)
