Google datastore - index a date created field without having a hotspot - sorting

I am using Google Datastore and will need to query it to retrieve some entities. These entities will need to be sorted from newest to oldest. My first thought was to have a date_created property which contains a timestamp. I would then index this field and sort on it. The problem with this approach is that it will cause hotspots in the database (https://cloud.google.com/datastore/docs/best-practices).
Do not index properties with monotonically increasing values (such as a NOW() timestamp). Maintaining such an index could lead to hotspots that impact Cloud Datastore latency for applications with high read and write rates.
Obviously, sorting data by date is probably the most common sort performed on a database. If I can't index timestamps, is there another way I can sort my queries from newest to oldest without hotspots?

As you note, indexing monotonically increasing values doesn't scale and can lead to hotspots. Whether you are potentially impacted by this depends on your particular usage.
As a general rule, the hotspotting point for this pattern is around 500 writes per second. If you know you're definitely going to stay under that, you probably don't need to worry.
If you do need more than 500 writes per second, but have an upper limit in mind, you could attempt a sharded approach. Basically, if your upper bound on writes per second is x, then n = ceiling(x/500), where n is the number of shards. When you write your timestamp, prepend random(1, n) at the start. This creates n random key ranges that can each sustain up to 500 writes per second. When you query your data, you'll need to issue n queries and do some client-side merging of the result streams.
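Here is a rough Python sketch of that sharded approach, assuming the google-cloud-datastore client; the kind and property names (Event, shard, created) are made up for illustration, and a composite index on (shard, created desc) would be needed:

    # Sketch only: spread writes across N_SHARDS key ranges, then fan out reads.
    import heapq
    import random
    from datetime import datetime, timezone

    from google.cloud import datastore

    N_SHARDS = 4  # ceiling(expected peak writes per second / 500)
    client = datastore.Client()

    def write_event(payload):
        entity = datastore.Entity(key=client.key("Event"))
        entity.update({
            "shard": random.randint(1, N_SHARDS),  # breaks up the monotonic key range
            "created": datetime.now(timezone.utc),
            "payload": payload,
        })
        client.put(entity)

    def newest_events(limit=50):
        # One query per shard (each already sorted newest-first), merged client-side.
        streams = []
        for shard in range(1, N_SHARDS + 1):
            query = client.query(kind="Event")
            query.add_filter("shard", "=", shard)
            query.order = ["-created"]
            streams.append(list(query.fetch(limit=limit)))
        merged = heapq.merge(*streams, key=lambda e: e["created"], reverse=True)
        return list(merged)[:limit]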

Related

Partition index to reduce buffer busy waits?

From time to time our Oracle response times degrade significantly for a minute or two, without any extra load.
We were able to identify an insert statement which produces a lot of buffer busy waits.
From the ADDM report, we got the following hint:
Consider partitioning the INDEX "IDX1" with object
ID 4711 in a manner that will evenly distribute concurrent DML across
multiple partitions.
To be honest, I am not sure what that means. I don't know what a partitioned index is. I can only imagine that it means creating a partition with a local index.
Can you help me out here?
There is a very high frequency of reading and writing to that table. No updates or deletes are used.
Thanks,
E.
I am not sure what that means.
Oracle is telling you that there is a lot of concurrent ("at the same time") activity on a very small part of your index. This happens a lot.
Consider an index column TAB1_PK on table TAB1 whose values are inserted from a sequence TAB1_S. Suppose you have 5 database sessions all inserting into TAB1 at the same time.
Because TAB1_PK is indexed, and because the sequence is generating values in numeric order, what happens is that all those sessions have to read and update the same blocks of the index at the same time.
This can cause a lot of contention -- way more than you would expect, due to the way indexes work with multi-version read consistency. I mean, in some rare situations (depending on how the transaction logic is written and the number of concurrent sessions), it can really be crippling.
The (really) old way to avoid this problem was to use a reverse key index. That way, the sequential column values did not all go to the same index blocks.
However, that is a two-edged sword. On the one hand, you get less contention because you're inserting all over the index (good). On the other hand, your rows are going all over the index, meaning you cannot cache them all. You've just turned a big logical I/O problem into a physical I/O problem!
Nowadays, we have a better solution -- a GLOBAL HASH PARTITION on the index.
With a GHP, you can specify the number of hash buckets and use that to trade off between how much contention you need to handle and how compact you want the index updates to be (for better buffer caching). The more index hash partitions you use, the better your concurrency but the worse your index block buffer caching will be.
I find a number (of global hash partitions) around 16 is pretty good.
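As a concrete illustration (not part of the answer above), the DDL could be issued like this from Python; this assumes the python-oracledb driver, echoes the IDX1/TAB1 names from the ADDM example, and uses placeholder connection details:

    import oracledb

    conn = oracledb.connect(user="app", password="secret", dsn="db-host/orclpdb1")
    cur = conn.cursor()
    # In practice you would drop/rebuild the existing index; 16 hash partitions
    # spread concurrent sequential inserts across 16 separate sets of index blocks.
    cur.execute("""
        CREATE INDEX idx1 ON tab1 (tab1_pk)
        GLOBAL PARTITION BY HASH (tab1_pk)
        PARTITIONS 16
    """)
    cur.close()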

How to design a system in which we can query top results in last n hours

I was asked this question in an interview. The details were that we are getting millions of events. Each event has a timestamp and other details. The system design must enable the end user to query the most frequent records in the last 10 minutes, 9 hours, or maybe 3 months.
Event can be seen as following
event_type: {CRUD + Search}
event_info: xxx
timestamp : ts...
The easiest way to figure this out is to look at how other stream processing or map-reduce libraries do this (and I have a feeling your interviewers have seen these libraries). It's basically real-time map reduce (you can look up how that works as well).
I will outline two techniques for event processing. In reality most companies need to do both.
New school Stream processing (real time)
Let's assume for now they don't want the actual events but the more likely case of aggregates (I think that was the intent of your question).
An example stream processing project is pipelinedb (they have how it works on the bottom of their home page).
Events go into a queue/ring buffer.
A worker process reads those events in batches and rolls them up into partial buckets or windows.
Finally there is a combiner or reducer which takes the micro-batches and actually does the updating. An example would be event counts. Because we are using a queue, events come in ordered, and depending on the queue we might be able to have multiple consumers that do the combining operation.
So if you want minute counts you would do rollups per minute and only store the sum of the events for that minute. This turns out to be fairly small space wise so you can store this in memory.
If you wanted those counts for month or day or even year you would just add up all the minute count buckets.
Now there is of course a major problem with this technique. You need to know what aggregates and pivots you would like to collect a priori.
But you get extremely fast look up of results.
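A minimal Python sketch of those minute buckets (the names and event shape are just illustrative):

    from collections import Counter
    from datetime import datetime, timedelta, timezone

    buckets = Counter()  # (event_type, minute_start) -> count

    def record(event_type, ts):
        minute = ts.replace(second=0, microsecond=0)
        buckets[(event_type, minute)] += 1

    def count_last(event_type, window, now=None):
        # Sum the minute buckets that fall inside the window, e.g.
        # count_last("Search", timedelta(hours=9)) sums 540 buckets.
        now = now or datetime.now(timezone.utc)
        start = now - window
        return sum(count for (etype, minute), count in buckets.items()
                   if etype == event_type and minute >= start)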
Old school data warehousing (partitioning) and Map Reduce (batch processed)
Now let's assume they do want the actual events for a certain time period. This is expensive because if you store all the events in one place the lookup and retrieval is difficult. But if you use the fact that time is hierarchical you can store the events in a tree of tuples.
The reason you would want the actual events is that you are doing ad hoc querying and are willing to wait for the queries to run.
You need some sort of queue for the stream of events.
A worker reads the queue and partitions the events based on time. For example you would have a partition for a certain day. This is akin to sharding. Many storage systems have support for this (e.g. Postgres partitions).
When you want a certain number of events over a period you union the partitions.
The partitioning is essentially hierarchical (minutes < hours < days etc.), which means you can do tree-like operations on them.
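To make the partition-and-union idea concrete, here is a toy in-memory Python sketch (a real system would use Postgres partitions or a TSDB, as the surrounding answer notes):

    from collections import defaultdict
    from datetime import timedelta

    partitions = defaultdict(list)  # day (a date) -> list of event dicts

    def ingest(event):
        # Route each event to the partition for its day.
        partitions[event["timestamp"].date()].append(event)

    def events_between(start, end):
        # "Union" only the partitions that overlap the query window.
        out, day = [], start.date()
        while day <= end.date():
            out.extend(e for e in partitions.get(day, [])
                       if start <= e["timestamp"] <= end)
            day += timedelta(days=1)
        return out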
There are certain ways to store such events, called time series data, where the time-based partitioning/indexing is automatic and fast. The databases built for this are called TSDBs, which you can Google for more info.
An example TSDB product would be InfluxDB.
Now, going back to the fact that time (or at least how humans represent it) is organized tree-like, we can perform parallel operations. This is because a tree is a DAG (directed acyclic graph). With a DAG you can do some analysis and basically recursively operate on the branches (also known as fork/join).
An example generic parallel storage product would be CitusDB.
Now of course this method has a massive drawback. It is expensive! Even if you make it fast by increasing the number of nodes, you will have to pay for those nodes (distributed shards). And in theory the performance should scale linearly, but in practice this does not happen (I will save you the details).
I think you will need to persist the data to disk because:
the query duration is very open-ended, and data might be lost due to unforeseen circumstances like the process being killed, machine failure, etc.
you can't keep all the events in memory due to memory constraints (millions of events)
I would suggest using MySQL as the data store, with the timestamp as one of the index keys. But two events might have the same timestamp, so make a composite index key of auto-increment id + timestamp.
Advantages of MySQL:
Super-reliable with replication
Supports all kinds of CRUD operations and queries
On each query you can basically select the range of timestamps you need.
First, count the number of events satisfying the query:
select count(*) from `events` where `timestamp` >= x and `timestamp` <= y;
If too many events satisfy the query, query them in batches:
select * from `events` where `timestamp` >= x and `timestamp` <= y limit 1000 offset 0;
select * from `events` where `timestamp` >= x and `timestamp` <= y limit 1000 offset 1000;
and so on, until the offset reaches the count returned by the first query.
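For example, the batching could look like this in Python against any DB-API MySQL connection (e.g. PyMySQL), assuming the auto-increment id column mentioned above; the order by keeps the pages stable:

    def fetch_events(conn, ts_from, ts_to, batch_size=1000):
        # Generator that yields all rows between ts_from and ts_to in batches.
        with conn.cursor() as cur:
            cur.execute(
                "select count(*) from `events` where `timestamp` >= %s and `timestamp` <= %s",
                (ts_from, ts_to),
            )
            total = cur.fetchone()[0]
            for offset in range(0, total, batch_size):
                cur.execute(
                    "select * from `events` where `timestamp` >= %s and `timestamp` <= %s "
                    "order by `timestamp`, `id` limit %s offset %s",
                    (ts_from, ts_to, batch_size, offset),
                )
                yield from cur.fetchall()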

Parse, replacing a large number (several thousand) of records

I've got a class in Parse with 1-4k records per user. These need to be replaced from time to time (actually these are records representing multiple timetables).
The problem I'm facing is that deleting and inserting these records takes a ton of requests. Is there maybe a method to delete and insert a bunch of records that counts as one request? Maybe it's possible from Cloud Code?
I tried compacting all this data into one record, but then I hit the size limit for records (128 KB). Using any sub-format (like a db or file inside a record) would be really tedious, because the app is targeting nearly all platforms supported by Parse.
EDIT
For clarification, the problem isn't the limit on saveAll/destroyAll. My problem is the req/s limit (or rather, as the docs state, req/min).
Also, I just checked that requests from Cloud Code also seem to count towards that limit.
Well, a possible solution would also be to redesign my datasets and use Array columns or something, but I'd rather avoid it if possible.
I think you could try Parse.Object.saveAll which batch processes the save() function.
Docs: https://www.parse.com/docs/js/api/symbols/Parse.Object.html#.saveAll
Guide: https://parse.com/questions/parseobjectsaveall-performances
I would use saveAll/destroyAll (and anything -All that Parse provides in its SDK).
You'd still hit the 1000-object query limit, but to counter that you can loop using the .skip property of a query.
Set a limit of 1000 and a skip of 0, do the query, then increase the skip value by the previous limit, and so on; you'd have 2 or 3 requests of a size of 1000 each time. You stop the loop when your result count is smaller than your limit. If it's not, you query again and set the skip to the limit x loop count.
Now, you say you're facing size issues; maybe you can reduce that query limit to, say, 400, and your loop would just run for longer until your number of results is smaller than your limit (and then you can stop querying/limiting/skipping/looping or anything in -ing).
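An SDK-agnostic Python sketch of that loop; query_page, destroy_all and save_all are hypothetical stand-ins for the Parse find/destroyAll/saveAll calls:

    def fetch_all(query_page, limit=1000):
        # Page through everything with limit/skip until a short page comes back.
        results, skip = [], 0
        while True:
            batch = query_page(limit=limit, skip=skip)
            results.extend(batch)
            if len(batch) < limit:
                return results
            skip += limit

    def replace_all(query_page, destroy_all, save_all, new_objects, limit=1000):
        old = fetch_all(query_page, limit)
        for i in range(0, len(old), limit):
            destroy_all(old[i:i + limit])          # one batched request per chunk
        for i in range(0, len(new_objects), limit):
            save_all(new_objects[i:i + limit])     # same for the replacements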
Okay, so this isn't an answer to my question, but it's a solution to my problem, so I'm posting it.
My problem was storing and then replacing a large number of small records which add up to a significant size (up to 500KB of JSON [~1.5MB of XML] in my current plans).
So I've chosen a middle path - I implemented a sort of vertical partitioning.
What I have is a master User record which holds an array of pointers to another class (called Entries). Entries have only 2 fields - the ID of the school record and data, which is of type Array.
I decided to split the "partitions" every 1000 records, which is about ~60-70KB per record, but in my calculations may go up to ~100KB.
I also made the field names in the JSON 1 letter, because every letter across 1000 records is like 1 or 2 KB, depending on encoding.
That approach made the PHP code about twice as fast, and there is a lot less load on the network and the remote database (basically 1000 times fewer inserts/destroys).
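A small Python sketch of that layout; the 1-letter keys and record fields here are just illustrative, not the actual schema:

    def build_entries(school_id, records, chunk_size=1000):
        # Split the timetable rows into Entries of ~1000 records each,
        # with 1-letter JSON keys to keep each Entries object well under 128 KB.
        entries = []
        for i in range(0, len(records), chunk_size):
            chunk = records[i:i + chunk_size]
            compact = [{"n": r["name"], "t": r["time"], "r": r["room"]} for r in chunk]
            entries.append({"s": school_id, "d": compact})
        return entries

    # Each returned dict becomes one Entries object; the master User record
    # keeps an array of pointers to them.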
So, that is my solution. If anybody has any other ideas, please post them as an answer here, because I'm probably not the only one with such a problem and this certainly isn't the only solution.

Smart chunking from a huge table

I have a huge table in a data warehouse (Vertica). I am accessing this table in chunks for optimization purposes. The way I decide my chunks is pretty straightforward. I have a primary key column, say A, and I take MAX(A). I have a chunk size of 20000, so I create (MAX(A)/20000)+1 chunks. I frame a query for each chunk and retrieve the data.
The problem with this approach is as follows:
The number of chunks depends on MAX(A), and MAX(A) is growing very fast, so the number of chunks increases with it.
I decided on the number 20000 because that is what gives me optimal performance. But the distribution of primary keys within the 20000-wide ranges is very scattered. For example, the range 0-20000 might contain only 3 elements and the range 20000-40000 might contain 500 elements, and no range comes close to 20000.
I am trying to figure out whether there are any good approximation algorithms for this problem that minimize the number of chunks and fill close to 20000 primary keys into each chunk.
Any pointers towards a solution are appreciated.
I'm not sure what "optimization purposes" means, but I think the best approach would be to create a timestamp column, or use an eligible existing timestamp column, to partition on. You could then partition on a larger frame of reference so there isn't a wide range between partitions.
If the table is partitioned, it will be able to benefit from partition pruning. This means that Vertica can eliminate the storage containers during query execution which do not match on the timestamp predicate.
Otherwise, you can look at the segmentation clause and use the max/min from the storage containers. This could be slightly more complicated.
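If you do want evenly filled chunks on the primary key itself, one (hedged) alternative to the fixed 20000-wide ranges is to walk the keys in order and cut a boundary every ~20000 actual rows. In this Python sketch, fetch_keys_after is a hypothetical helper that returns the next batch of primary keys greater than last_key in ascending order (e.g. via a keyset-paginated query):

    def chunk_boundaries(fetch_keys_after, chunk_size=20000):
        boundaries, last_key = [], None
        while True:
            keys = fetch_keys_after(last_key, limit=chunk_size)
            if not keys:
                break
            last_key = keys[-1]
            boundaries.append(last_key)   # this chunk holds len(keys) (~20000) rows
            if len(keys) < chunk_size:
                break
        return boundaries

    # Consecutive boundaries then define the chunks: A <= boundaries[0],
    # boundaries[0] < A <= boundaries[1], and so on.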

Performance with sequentially increasing primary key

Looking for guidance on selecting a database provider for a specific key pattern.
The only key field will be a pre-allocated unique sequentially-increasing number.
During each day between 50 and 100 thousand items will be added, processed (updated), and then retained for a week or so, after which usually the lowest-numbered records will be deleted. The number of records will not fluctuate by very much from day to day but may drop at weekends. The numbers will probably wrap back to 1 after 100M or so.
I need to find a database implementation where the efficiency of the index lookup, addition and deletion remains constant. Should I be worried that the performance might drop off as the key value range moves continuously upwards?
index lookup, addition and deletion remains constant
You could ensure it remains constant by rebuilding the indexes on every insert (just constantly really slow - no performance drop-off at all :)), or keep it close to constant by running index maintenance every hour/day etc.
that the performance might drop off as the key value range moves continuously upwards?
As long as you've got an index, lookups should be log(N) performance - e.g. searching 1,000,000 rows will be around half the speed of searching 1,000 rows for an indexed value, and 1,000,000,000,000 rows will be half that speed again.
So no, you shouldn't need to worry about performance.
The numbers will probably wrap back to 1 after 100M or so.
Ok - if you want. Generally, no need really - just use a big int.
As always with performance: test what you want to do. Make a script that inserts 10,000,000 rows, and see what happens.
My point here being that if you're going to wrap IDs at 100M records, the worst case is actually having them all allocated. This would also represent the fragmented-index condition (where you only have, say, 100K records, but they're distributed in a space of 10M) - but you will do index/database maintenance, right?
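A throwaway benchmark along those lines could look like this; it uses Python's stdlib sqlite3 purely for illustration, so absolute numbers on a real server database will differ:

    import sqlite3
    import time

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")

    def grow_to(n):
        # Insert sequentially increasing keys up to n rows.
        (start,) = conn.execute("SELECT COALESCE(MAX(id), 0) FROM items").fetchone()
        conn.executemany(
            "INSERT INTO items (id, payload) VALUES (?, ?)",
            ((i, "x" * 32) for i in range(start + 1, n + 1)),
        )
        conn.commit()

    def time_lookups(n, probes=10000):
        # Time point lookups on the indexed primary key.
        t0 = time.perf_counter()
        for i in range(1, probes + 1):
            conn.execute("SELECT payload FROM items WHERE id = ?",
                         ((i * 37) % n + 1,)).fetchone()
        return time.perf_counter() - t0

    for size in (1_000, 100_000, 1_000_000):
        grow_to(size)
        print(size, "rows:", round(time_lookups(size), 3), "s for 10k lookups")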
