Does this BIG monogdb storage cause low performance? - performance

We have a mongodb with 336GB data on it.
Unfortunately there is only 8GB memory on that server.
Is it true to say that this will slow the db down, especially when I try to traverse the entire collection?
What can I do to improve performance?

To get things right, this isn't a "BIG" production setup; it is actually relatively small.
That aside:
Is it true to say that this will slow the db down, especially when I try to traverse the entire collection?
It is true yes. As you iterate the collection MongoDB will need to page in your data, this is true even if you have indexes on the collection.
The exception to this is when you use indexOnly cursors whereby all the data comes only from the index, including the returned document; these are otherwise known as covered queries.
The problem you have here is that your dataset is 42x greater than your RAM amount, assuming you are allowed to use all your RAM (this is not true of course, the OS and other programs will reserve amounts off for themselves). This means that if you expect to iterate the entire collection you will not be able to do it performantly, instead MongoDB could be page thrashing its allocated memory.
What can I do to improve performance?
Get a little more RAM.
You could also try a bit of sharding if getting too much RAM on that one server is a pain.
I would aim for about 20x more data than RAM, that shouldn't be too bad in most cases.

You should index your collection http://docs.mongodb.org/manual/applications/indexes/ to improve performance, but bear in mind that memory is utilised by mongodb when querying indexes so make sure each index you create can fit within the memory you have on your server.
You could also shard your collection but you will need more servers to do this. http://docs.mongodb.org/manual/sharding/
And I know it's obvious but get more memory - its cheap!

Mongodb uses memory-mapped files to map the data in to the systems virtual memory. If you try to access more data than the available memory of the system, the performance will be poor. You'll have to consider other options like sharding, indexing, increasing RAM etc. Indexing may improve the performance but not by much if done on a large data set, because indexes also need memory. A few references:
First 3 questions talk about memory-mapped files: http://docs.mongodb.org/manual/faq/storage/
On sharding: http://docs.mongodb.org/manual/faq/sharding/
Ensuring index fit into the RAM: http://docs.mongodb.org/manual/applications/indexes/#ensure-indexes-fit-ram

The other answers say either "have enough memory to fit your data" or "have enough memory for each index" or "have some multiple of your RAM in data". None of those are very effective nor very precise for capacity planning.
You need to know what your access patterns will be and then decide what indexes you will need to effectively be able to use your data. If all of your indexes fit in available RAM with some room to spare for most recently touched documents, then you should be okay.
When your working set (accessed data + indexes) cannot fit in RAM then your performance will be correlated more with disk access speed than anything else. Depending on how fast your disks are and on your throughput and latency requirements, it may work out okay or it may not.
While there is not enough information to say with certainty whether you will succeed or fail on this particular machine, you should be able to collect enough information to determine that for yourself by analyzing your indexing needs, etc.

Related

Cassandra client code with high read throughput with row_cache optimization

Can someone point me to cassandra client code that can achieve a read throughput of at least hundreds of thousands of reads/s if I keep reading the same record (or even a small number of records) over and over? I believe row_cache_size_in_mb is supposed to cache frequently used records in memory, but setting it to say 10MB seems to make no difference.
I tried cassandra-stress of course, but the highest read throughput it achieves with 1KB records (-col size=UNIFORM\(1000..1000\)) is ~15K/s.
With low numbers like above, I can easily write an in-memory hashmap based cache that will give me at least a million reads per second for a small working set size. How do I make cassandra do this automatically for me? Or is it not supposed to achieve performance close to an in-memory map even for a tiny working set size?
Can someone point me to cassandra client code that can achieve a read throughput of at least hundreds of thousands of reads/s if I keep reading the same record (or even a small number of records) over and over?
There are some solution for this scenario
One idea is to use row cache but be careful, any update/delete to a single column will invalidate the whole partition from the cache so you loose all the benefit. Row cache best usage is for small dataset and are frequently read but almost never modified.
Are you sure that your cassandra-stress scenario never update or write to the same partition over and over again ?
Here are my findings: when I enable row_cache, counter_cache, and key_cache all to sizable values, I am able to verify using "top" that cassandra does no disk I/O at all; all three seem necessary to ensure no disk activity. Yet, despite zero disk I/O, the throughput is <20K/s even for reading a single record over and over. This likely confirms (as also alluded to in my comment) that cassandra incurs the cost of serialization and deserialization even if its operations are completely in-memory, i.e., it is not designed to compete with native hashmap performance. So, if you want get native hashmap speeds for a small-working-set workload but expand to disk if the map grows big, you would need to write your own cache on top of cassandra (or any of the other key-value stores like mongo, redis, etc. for that matter).
For those interested, I also verified that redis is the fastest among cassandra, mongo, and redis for a simple get/put small-working-set workload, but even redis gets at best ~35K/s read throughput (largely independent, by design, of the request size), which hardly comes anywhere close to native hashmap performance that simply returns pointers and can do so comfortably at over 2 million/s.

Strategy for "user data" in couchbase

I know that a big part of the performance from Couchbase comes from serving in-memory documents and for many of my data types that seems like an entirely reasonable aspiration but considering how user-data scales and is used I'm wondering if it's reasonable to plan for only a small percentage of the user documents to be in memory all of the time. I'm thinking maybe only 10-15% at any given time. Is this a reasonable assumption considering:
At any given time period there will be a only a fractional number of users will be using the system.
In this case, users only access there own data (or predominantly so)
Recently entered data is exponentially more likely to be viewed than historical user documents
UPDATE:
Some additional context:
Let's assume there's a user base of a 1 million customers, that 20% rarely if ever access the site, 40% access it once a week, and 40% access it every day.
At any given moment, only 5-10% of the user population would be logged in
When a user logs in they are like to re-query for certain documents in a single session (although the client does do some object caching to minimise this)
For any user, the most recent records are very active, the very old records very inactive
In summary, I would say of a majority of user-triggered transactional documents are queried quite infrequently but there are a core set -- records produced in the last 24-48 hours and relevant to the currently "logged in" group -- that would have significant benefits to being in-memory.
Two sub-questions are:
Is there a way to indicate a timestamp on a per-document basis to indicate it's need to be kept in memory?
How does couchbase overcome the growing list of document id's in-memory. It is my understanding that all ID's must always be in memory? isn't this too memory intensive for some apps?
First,one of the major benefits to CB is the fact that it is spread across multiple nodes. This also means your queries are spread across multiple nodes and you have a performance gain as a result (I know several other similar nosql spread across nodes - so maybe not relevant for your comparison?).
Next, I believe this question is a little bit too broad as I believe the answer will really depend on your usage. Does a given user only query his data one time, at random? If so, then according to you there will only be an in-memory benefit 10-15% of the time. If instead, once a user is on the site, they might query their data multiple times, there is a definite performance benefit.
Regardless, Couchbase has pretty fast disk-access performance, particularly on SSDs, so it probably doesn't make much difference either way, but again without specifics there is no way to be sure. If it's a relatively small document size, and if it involves a user waiting for one of them to load, then the user certainly will not notice a difference whether the document is loaded from RAM or disk.
Here is an interesting article on benchmarks for CB against similar nosql platforms.
Edit:
After reading your additional context, I think your scenario lines up pretty much exactly how Couchbase was designed to operate. From an eviction standpoint, CB keeps the newest and most-frequently accessed items in RAM. As RAM fills up with new and/or old items, oldest and least-frequently accessed are "evicted" to disk. This link from the Couchbase Manual explains more about how this works.
I think you are on the right track with Couchbase - in any regard, it's flexibility with scaling will easily allow you to tune the database to your application. I really don't think you can go wrong here.
Regarding your two questions:
Not in Couchbase 2.2
You should use relatively small document IDs. While it is true they are stored in RAM, if your document ids are small, your deployment is not "right-sized" if you are using a significant percentage of the available cluster RAM to store keys. This link talks about keys and gives details relevant to key size (e.g. 250-byte limit on size, metadata, etc.).
Basically what you are making a decision point on is sizing the Couchbase cluster for bucket RAM, and allowing a reduced residency ratio (% of document values in RAM), and using Cache Misses to pull from disk.
However, there are caveats in this scenario as well. You will basically also have relatively constant "cache eviction" where "not recently used" values are being removed from RAM cache as you pull cache missed documents from disk into RAM. This is because you will always be floating at the high water mark for the Bucket RAM quota. If you also simultaneously have a high write velocity (new/updated data) they will also need to be persisted. These two processes can compete for Disk I/O if the write velocity exceeds your capacity to evict/retrieve, and your SDK client will receive a Temporary OOM error if you actually cannot evict fast enough to open up RAM for new writes. As you scale horizontally, this becomes less likely as you have more Disk I/O capacity spread across more machines all simultaneously doing this process.
If when you say "queried" you mean querying indexes (i.e. Views), this is a separate data structure on disk that you would be querying and of course getting results back is not subject to eviction/NRU, but if you follow the View Query with a multi-get the above still applies. (Don't emit entire documents into your Index!)

Riak performance - unexpected results

In the last days I played a bit with riak. The initial setup was easier then I thought. Now I have a 3 node cluster, all nodes running on the same vm for the sake of testing.
I admit, the hardware settings of my virtual machine are very much downgraded (1 CPU, 512 MB RAM) but still I am a quite surprised by the slow performance of riak.
Map Reduce
Playing a bit with map reduce I had around 2000 objects in one bucket, each about 1k - 2k in size as json. I used this map function:
function(value, keyData, arg) {
var data = Riak.mapValuesJson(value)[0];
if (data.displayname.indexOf("max") !== -1) return [data];
return [];
}
And it took over 2 seconds just for performing the http request returning its result, not counting the time it took in my client code to deserialze the results from json. Removing 2 of 3 nodes seemed to slightly improve the performance to just below 2 seconds, but this still seems really slow to me.
Is this to be expected? The objects were not that large in bytesize and 2000 objects in one bucket isnt that much, either.
Insert
Batch inserting of around 60.000 objects in the same size as above took rather long and actually didnt really work.
My script which inserted the objects in riak died at around 40.000 or so and said it couldnt connect to the riak node anymore. In the riak logs I found an error message which indicated that the node ran out of memory and died.
Question
This is really my first shot at riak, so there is definately the chance that I screwed something up.
Are there any settings I could tweak?
Are the hardware settings too constrained?
Maybe the PHP client library I used for interacting with riak is the limiting factor here?
Running all nodes on the same physical machine is rather stupid, but if this is a problem - how can i better test the performance of riak?
Is map reduce really that slow? I read about the performance hit that map reduce has on the riak mailing list, but if Map Reduce is slow, how are you supposed to perform "queries" for data needed nearly in realtime? I know that riak is not as fast as redis.
It would really help me a lot if anyone with more experience in riak could help me out with some of these questions.
This answer is a bit late, but I want to point out that Riak's mapreduce implementation is designed primarily to work with links, not entire buckets.
Riak's internal design is actually pretty much optimized against working with entire buckets. That's because buckets are not considered to be sequential tables but a keyspace distributed across a cluster of nodes. This means that random access is very fast — probably O(log n), but don't quote me on that — whereas serial access is very, very, very slow. Serial access, the way Riak is currently designed, necessarily means asking all nodes for their data.
Incidentally, "buckets" in Riak terminology are, confusingly and disappointingly, not implemented the way you probably think. What Riak calls a bucket is in reality just a namespace. Internally, there is only one bucket, and keys are stored with the bucket name as a prefix. This means that no matter how small or large you bucket is, enumerating the keys in a single bucket of size n will take m time, where m is the total number of keys in all buckets.
These limitations are implementation choices by Basho, not necessarily design flaws. Cassandra implements the exact same partitioning model as Riak, but supports efficient sequential range scans and mapreduce across large amounts of keys. Cassandra also implements true buckets.
A recommendation I'd have now that some time has passed and several new versions of Riak have come about is this. Never rely on full bucket map/reduce, that's not an optimized operation, and chances are very good there are other ways to optimize your map/reduce so you don't have to look through so much data to pull out the singlets you need.
Secondary indices now available in newer versions of Riak are definitely the way to go in this regard. Put an index on the objects you want to find (perhaps named 'ismax_int' with a value of 0 or 1). You can map/reduce a secondary index with hundreds of thousands of keys in microseconds which a full bucket scan would have taken multiple seconds to consider.
I don't have direct experience of Riak, but have worked with Cassandra a little, which is similar.
Firstly, performance will probably depend a lot on the number of cores available, and the memory. These systems are usually heavily pipelined and concurrent and benefit from a lot of cores. 4+ cores and 4GB+ of RAM would be a good starting point.
Secondly, MapReduce is designed for batch processing, not realtime queries.
Riak and all similar Key-Value stores are designed for high write performance, high read performance for simple lookups, no complex querying at all.
Just for comparison, Cassandra on a single node (6 core, 6GB) can do 20,000 individual inserts per second.

Does it make sense to optimize queries for less i/o pressure?

I have a read only database (product) that recides on its own Sql Server 2008.
I already optimized queries by looking at most expensive queries in activity monitor - report. I ordered the report by CPU-cost. I now have something like 50 queries/second and no query is longer than 300ms.
CPU-Time is ok (30%) and Memory is only used by 20% (out of 64GB).
There is one issue: disk time is at steady 100% (I looked at idle time performance counter and used ideras SQL diagnostic manager). I can see that the product db behaves different than my order db which is on a different machine and has smaller tables: If I look at a profiler trace I have queries in product-db that show a value in column "read" higher than 50.000. In my order DB these values are never higher than 1000. The queries in product-db use a lot of Common table expressions, work on large tables (some are around 5 Million entries).
I am not shure if I should invest time in optimizing queries for i/o performance or if I should just add a server. By otimizing for query duration I already added the missing indexes. Is optimizing for i/o something that is usually done?
In short, yes. Optimize for both CPU and IO.
Queries with high CPU tend to be doing unnecessary in-memory sorts, (sometimes inefficient) hash joins, or complex logic.
Queries with high IO (Page Reads) tend to be doing full table scans or working in other inefficient ways.
9 times out of 10, the same queries will be near the top of the list, but if you've worked on the high CPU and you still are unhappy with performance, then by all means, work on the high IO procs next.
There's always a next bottleneck.
they say.
Now that you've tuned CPU usage, it's only natural that I/O load emerges as dominant. Is your performance already acceptable? If yes stop, if no you have to estimate how many hours you will have to invest in further tuning and if buying another server or more hard disks might be cheaper.
Regarding the I/O tuning again, try to see what you can achieve with easy measures. Sometimes you can trade CPU for I/O and vice versa. Compression is an example for this. You would then tune that component that is your current bottlneck.
Before you seek to make the I/O faster try to reduce the I/O that is generated.
Look for obvious IO performance improvements for your query, but more importantly, look at how you can improve your IO performance at the server level.
If your other resources (CPU and memory) aren't overloaded, you probably don't need a new server. Consider adding an SSD for logs and temp files, and/or consider if you can affordably fit your whole DB onto an array of SSDs.
Of course, clearing out your disk IO bottleneck is likely to raise CPU usage, but if your performance is close to acceptable, this will probably improve things to the point that you can stop optimizing for now.
Unless you are using SSDs or a DB optimized SAN then IO is almost always the limit in database applications.
So yes, optimize to get rid of it as much as possible.
Table indexes are the first thing to do.
Then, add as much RAM as you possibly can, up to the complete size of your DB files.
Then partition your data tables (if that is a reasonable thing to do) so that any necessary table or index scans are done on only one or two table partitions.
Then I suppose you either buy bigger machines with even more RAM and/or buy SSDs or a SAN or a SAN with SSDs.
Alternatively you rebuild your entire database application to use something like NoSQL or database sharding, and implement all your relations, joins, constraints, etc in a middle interface layer.

Which server parameters to tweak in Solr if I expect heavy writes and light reads?

I am facing scalability issues designing a new Solr cluster and I need to master to be able to handle a relatively high rate of updates with almost no reads - they can be done via slaves.
My existing Solr instance is occupying a huge amount of RAM, in fact it started swapping at only 4.5mil docs. I am interested in making the footprint as little as possible in RAM, even if it affects search performance.
So, which Solr config values can I tweak in order to accomplish this?
Thank you.
It's hard to say without knowing the specifics of your enviroment (like the schema, custom indexers, queryfunctions etc...) and whats a huge amount of ram? but you could start by
setting filterCache, queryResultCache and documentCache to 0 in solrconfig.xml. This will severely impact the performance of queries executed in SOLR.
set compression to true TextField and StrField types that you store. Then set compressThreshold to a low integer value. This will decrease the size of the documents at the cost of increased CPU usage. (see http://wiki.apache.org/solr/SchemaXml#head-73cdcd26354f1e31c6268b365023f21ee8796613 for more details
turn off all autowarming queries and don't do any read queries
make sure you commit often enough
obviously these are all things to do on the master not on the slaves.

Resources