Elasticsearch shard allocation for small indices

I have an Elasticsearch setup with 192 active indices ranging from a few hundred MB to possibly 5 GB each. I read that for a Logstash use case with 1 GB indices you should only use 1 shard. The difference with my setup is that I will have more users (an estimated 100 at most) expecting a quick response time. I intend to have 1 replica for reliability.
Will having 1 shard per index still be appropriate for my use case?

In a word: yes.
The need to create multiple primary shards derives from the need to isolate documents, extreme counts (e.g., when you're in the billions of documents volume), or to improve write throughput (write documents across more places, thereby reducing individual burden).
In practice, you want to shard based on your use case, unless you're in one of those first two scenarios (isolation or extreme counts).
Are you read heavy?
Are you write heavy? (Less common, but it does happen)
If you're read heavy, as most use cases are, then having fewer shards will help you by limiting the request size (fewer places to look). Given that your shard sizes are also relatively small (I'd consider anything under 5 GB to be relatively small), you can easily get away with having a single primary shard and it should benefit your search performance by doing so.
Indexes that share the same mappings, but are also tiny ("few hundred MBs"), should likely be combined if you search across them. If they're independent, then it really makes no difference and the isolation sounds like good practice at the expense of slightly bloating your cluster state (with each index).
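As a rough sanity check, the total number of shard copies for the setup in the question can be worked out directly. This is only a sketch: the function name is made up for illustration, and the figures are the ones from the question (192 indices, 1 replica).

```python
def total_shards(index_count: int, primaries_per_index: int, replicas: int) -> int:
    """Total shard copies in the cluster: every primary plus its replica copies."""
    return index_count * primaries_per_index * (1 + replicas)


# Figures from the question: 192 indices with 1 replica each.
one_primary = total_shards(192, primaries_per_index=1, replicas=1)
five_primaries = total_shards(192, primaries_per_index=5, replicas=1)

print(one_primary)     # 384
print(five_primaries)  # 1920
```

With small indices, going from 1 to 5 primaries multiplies the number of shards the cluster has to manage and search without reducing the amount of data.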

Have a look at this blog post: https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index. It has a lot of good pointers on sharding and shard sizing.
However, the question you really should be asking yourself is: How easy is it to change? When it comes to sizing and scalability, the answer often is "it depends" - and the real question is: How quickly can you reconfigure?
This could mean, for example, that you design your application so that data can be quickly re-spooled into a new index, that you use aliases so that you can actually change these things, that you know where your source data lives (not just in Elasticsearch, I hope), and so on.
Building a system, from the start, so that you can quickly rebuild indices enables you to experiment with sizes and, more importantly, to change them as your needs change.
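To make the alias idea concrete, here is a minimal sketch that only builds the request body for Elasticsearch's `_aliases` endpoint (it does not send anything). The helper name and the index and alias names are hypothetical; the assumption is that the application always queries the alias, never a concrete index.

```python
import json


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Build a body for POST /_aliases that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the app queries "products", while the data lives
# in versioned indices that can be rebuilt with different shard settings.
body = alias_swap_body("products", "products_v1", "products_v2")
print(json.dumps(body, indent=2))
```

After reindexing into `products_v2` with a different shard count, sending this body switches readers over without any change on the application side.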

Related

Reason behind “Index creation no longer defaults to five shard”

What was the reasoning behind "Index creation no longer defaults to five shard but one shard"?
So far, the assumption was: more shards = more scalability = more parallelism.
Isn't that change defeating the whole purpose of distributed systems like ES?
Yes, you can equate more shards with more scalability and more parallelism, but that only holds when those shards can actually use multiple cores or multiple machines (data nodes) in the cluster.
This is a default config, created for basic workloads. It obviously needs more fine-tuning for advanced use cases, which is the whole point of making it configurable. It's very difficult to design the perfect Elasticsearch cluster because it depends on various factors, so Elasticsearch provides default values that work for general use cases.
Either you start with a modest workload that gradually increases, or you start with a huge workload from the beginning (in which case you will have more shards anyway to get the benefit described in the first line; this is the advanced use case).
The first case is more common, and the beauty of Elasticsearch is that you can get started with little knowledge: the default settings work quite well for modest workloads, and often you don't have to change them or even understand them in detail.
Having many shards for a small number of documents with heavy search traffic caused problems (5 threads were used for a single search, since the default was 5 shards), and this is the common situation for most basic and modest applications out there.
So it makes sense to change the default to 1 shard, as that fits the more common use case. Beyond that, you need to go deeper anyway to scale your cluster, which requires fine-tuning Elasticsearch further.
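If the single-shard default does not fit a workload, the shard count can be set explicitly when the index is created. The sketch below only builds the request body (to be sent with `PUT /<index-name>`); the helper name and the example values are made up for illustration.

```python
import json


def index_settings(primary_shards: int = 1, replicas: int = 1) -> dict:
    """Build the settings part of an index-creation request body."""
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


# Mirrors the single-shard default discussed above.
default_like = index_settings()
# An illustrative override for a larger, write-heavy index.
larger_index = index_settings(primary_shards=5, replicas=1)

print(json.dumps(default_like))
print(json.dumps(larger_index))
```

Note that `number_of_shards` is fixed when the index is created, while `number_of_replicas` can be changed on a live index, so the primary count is the one worth thinking about up front.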

Elasticsearch - Sharding and Performance

I think I've finally gotten a grasp of the fundamental understanding of how to allocate shards for Elasticsearch. Please correct me if I'm wrong, this is what I've pieced together:
Ideally, there should only exist one shard per index, per node.
The only reason why we would ever want to configure more than
one shard IS to over-allocate for future growth (i.e. adding more
nodes to physically support the data).
Now, assuming what I have above is correct, I then wonder if there are any performance issues or differences if I only had one node with 1 shard versus one node with 5 shards. Can anyone enlighten me on this subject?
"The only reason why we would ever want to configure more than one shard IS to over-allocate for future growth (i.e. adding more nodes to physically support the data)."
Not necessarily so. Having more shards helps parallelise your queries and helps them finish faster, but after a bit it can be counterproductive as too many shards will mean overheads in merging the individual shard responses and time spent in queuing and such things.
"one node with 1 shard versus one node with 5 shards"
It depends on what your use case is but you should see some performance gain for bigger queries, with 5 shards.
I believe it depends on the size of the shards. For instance, on the Elastic website, they say the following:
"Querying lots of small shards will make the processing per shard
faster, but as many more tasks need to be queued up and processed in
sequence, it is not necessarily going to be faster than querying a
smaller number of larger shards. Having lots of small shards can also
reduce the query throughput if there are multiple concurrent queries."
https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
In practice I have found that some exploratory testing with realistic queries helps me determine more definitively how I should move forward with my architecture. It really depends on the use case. As was stated previously, however, there comes a point where you can "over-optimize", and it ends up cancelling out any noticeable gains you might otherwise have obtained.
To be succinct, one shard per index, per node is a fine practice. But if you find yourself needing more, then just assess your use case first and determine if additional shards are truly necessary.

Resource usage with rolling indices in Elasticsearch

My question is mostly based on the following article:
https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index
The article advises against having multiple shards per node for two reasons:
Each shard is essentially a Lucene index; it consumes file handles, memory, and CPU resources.
Each search request will touch a copy of every shard in the index. Contention arises and performance decreases when the shards are competing for the same hardware resources.
The article advocates the use of rolling indices for indices that see many writes and fewer reads.
Questions:
Do the problems of resource consumption by Lucene indices arise if the old indices are left open?
Do the problems of contention arise when searching over a large time range involving many indices and hence many shards?
How does searching many small indices compare to searching one large one?
I should mention that in our particular case, there is only one ES node though of course generally applicable answers will be more useful to SO readers.
It's very difficult to spit out general best practices and guidelines when it comes to cluster sizing as it depends on so many factors. If you ask five ES experts, you'll get ten different answers.
After several years of tinkering and fiddling with ES, I've found that what works best for me is always to start small (one node, as many indices as your app needs, and one shard per index), load a representative data set (ideally your full data set) and load test to death. Your load testing scenarios should represent the real maximum load you're experiencing (or expecting) in your production environment during peak hours.
Increase the capacity of your cluster (add shards, add nodes, tune knobs, etc.) until your load tests pass, and make sure to increase your capacity by a few more percent to allow for future growth. You don't want your production to be fine now; you want it to be fine a year from now. Of course, it will depend on how fast your data grows, and it's very unlikely that you can predict with 100% certainty what will happen a year from now. For that reason, as soon as my load tests pass, if I expect large exponential data growth, I usually increase the capacity by another 50%, knowing that I will have to revisit my cluster topology within a few months or a year.
So to answer your questions:
Yes, if old indices are left open, they will consume resources.
Yes, the more indices you search, the more resources you will need in order to go through every shard of every index. Be careful with aliases spanning many, many rolling indices (especially on a single node).
This is too broad to answer, as it again depends on the amount of data we're talking about and on what kind of query you're sending, whether it uses aggregation, sorting and/or scripting, etc.
Do the problems of resource consumption by Lucene indices arise if the old indices are left open?
Yes.
Do the problems of contention arise when searching over a large time range involving many indices and hence many shards?
Yes.
How does searching many small indices compare to searching one large one?
When ES searches an index, it picks one copy of each shard (be it replica or primary) and asks that copy to run the query on its own set of data. Searching a shard uses one thread from the node's search thread pool (the thread pool is per node). One thread basically means one CPU core. If your node has 8 cores, then at any given time the node can search 8 shards concurrently.
Imagine you have 100 shards on that node and your query wants to search all of them. ES will initiate the search and all 100 shards will compete for the 8 cores, so some shards will have to wait some amount of time (microseconds, milliseconds, etc.) to get their share of those 8 cores. Having many shards means fewer documents in each and, thus, potentially a faster response time from each. But then the node that initiated the request needs to gather all the shards' responses and aggregate the final result. So, the response will be ready when the slowest shard finally responds with its set of results.
On the other hand, if you have a big index with very few shards, there is not so much contention for those CPU cores. But since each shard has a lot of work to do individually, it can take more time to return its individual result.
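The trade-off described above can be put into a toy model. This is a deliberately crude sketch, not how Elasticsearch actually schedules work: it assumes the simplification from this answer (one shard search per core at a time), ignores the final merge step, and the per-shard timings are invented purely for illustration.

```python
import math


def search_rounds(shards_to_search: int, concurrent_slots: int) -> int:
    """How many rounds are needed if only `concurrent_slots` shards can be searched at once."""
    return math.ceil(shards_to_search / concurrent_slots)


def rough_latency_ms(shards_to_search: int, concurrent_slots: int, per_shard_ms: float) -> float:
    """Toy estimate: every round takes as long as one shard search."""
    return search_rounds(shards_to_search, concurrent_slots) * per_shard_ms


# Many small shards: each is fast (assumed 5 ms), but they queue for 8 cores.
many_small = rough_latency_ms(100, concurrent_slots=8, per_shard_ms=5)
# Few large shards: no queueing, but each takes longer (assumed 40 ms).
few_large = rough_latency_ms(4, concurrent_slots=8, per_shard_ms=40)

print(search_rounds(100, 8))  # 13
print(many_small)             # 65
print(few_large)              # 40
```

With these made-up numbers the few large shards win, but changing the assumed per-shard times flips the result, which is why testing with real data matters.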
When choosing the number of shards, many aspects need to be considered. As a rough guideline, yes, 30 GB per shard is a good limit. But this won't work for everyone and for every use case, and the article fails to mention that. If, for example, your index uses parent/child relationships, those 30 GB per shard might be too much and the response time of a single shard can be too slow.
You took this out of context: "The article advises against having multiple shards per node". No, the article advises one to think beforehand about how to structure an index's shards. One important step here is testing. Please test with your data before deciding how many shards you need.
You mentioned in the post "rolling indices", and I assume time-based indices. In this case, one question is about the retention period (for how long you need the data). Based on the answer to this question you can determine how many indices you'll have. Knowing how many indices you'll have gives you the total number of shards you'll have.
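The chain from retention period to index count to shard count is simple arithmetic. A minimal sketch, with a made-up function name and example values (a 90-day retention on a single node, hence no replicas):

```python
import math


def rolling_index_footprint(retention_days: int, days_per_index: int,
                            primaries: int, replicas: int) -> tuple[int, int]:
    """Return (number of open indices, total shard copies) for time-based indices."""
    indices = math.ceil(retention_days / days_per_index)
    shards = indices * primaries * (1 + replicas)
    return indices, shards


# Assumed example: keep 90 days of data, 1 primary shard, no replicas.
daily = rolling_index_footprint(90, days_per_index=1, primaries=1, replicas=0)
monthly = rolling_index_footprint(90, days_per_index=30, primaries=1, replicas=0)

print(daily)    # (90, 90)
print(monthly)  # (3, 3)
```

A query spanning the whole retention period touches every one of those shards, so the rollover interval is as much a sizing decision as the shard count per index.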
Also, with rolling indices, you need to take care of deleting the expired indices. Have a look at Curator for this.

Elasticsearch with different java heaps, does it matter?

Say I've got 2 servers, one of which has -Xmx and -Xms set to 4G and the other to 2G.
Will Elasticsearch account for those performance differences when balancing? Or will both servers be called purely based on indices, making an OOM (much) more likely for the latter than for the former?
By the way, I've set the properties indices.fielddata.cache.size, indices.breaker.fielddata.limit, indices.breaker.request.limit, and indices.breaker.total.limit on both servers, as Elasticsearch suggests.
This is important, to me, because if it does, I'd have to change the index sharding on guessed index strain, which will be a hassle (if not impossible)
Elasticsearch treats every node the same and balances shards equally between them. This means that Elasticsearch won't readjust based on hardware to get you optimal performance.
One thing to remember here is that a herd of bulls is only as fast as its slowest bull. The same applies here. But if the load is small enough that it does not exhaust the hardware of the 2 GB machine, then you should not see any issue. Otherwise you should see a difference in memory-intensive operations like aggregations.

max number of couchbase views per bucket

How many views per bucket is too much, assuming a large amount of data in the bucket (>100GB, >100M documents, >12 document types), and assuming each view applies only to one document type? Or asked another way, at what point should some document types be split into separate buckets to save on the overhead of processing all views on all document types?
I am having a hard time deciding how to split my data into Couchbase buckets, and the performance implications of the views required on the data. My data consists of more than a dozen relational DBs, at least half of which have hundreds of millions of rows in a number of tables.
The http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views-writing-bestpractice.html doc section "using document types" seems to imply having multiple document types in the same bucket is not ideal because views on specific document types are updated for all documents, even those that will never match the view. Indeed, it suggests separating data into buckets to avoid this overhead.
Yet there is a limit of 10 buckets per cluster for performance reasons. My only conclusion therefore is that each cluster can handle a maximum of 10 large collections of documents efficiently. Is this accurate?
Tug's advice was right on and allow me to add some perspective as well.
A bucket can be considered most closely related to (though not exactly) a "database instantiation" within the RDBMS world. There will be multiple tables/schemas within that "database" and those can all be combined within a bucket.
Think about a bucket as a logical grouping of data that all shares some common configuration parameters (RAM quota, replica count, etc) and you should only need to split your data into multiple buckets when you need certain datasets to be controlled separately. Other reasons are related to very different workloads to different datasets or the desire to be able to track the workload to those datasets separately.
Some examples:
-I want to control the caching behavior for one set of data differently than another. For instance, many customers have a "session" bucket that they want always in RAM whereas they may have a larger, "user profile" bucket that doesn't need all the data cached in RAM. Technically these two data sets could reside in one bucket and allow Couchbase to be intelligent about which data to keep in RAM, but you don't have as much guarantee or control that the session data won't get pushed out...so putting it in its own bucket allows you to enforce that. It also gives you the added benefit of being able to monitor that traffic separately.
-I want some data to be replicated more times than others. While we generally recommend only one replica in most clusters, there are times when our users choose certain datasets that they want replicated an extra time. This can be controlled via separate buckets.
-Along the same lines, I only want some data to be replicated to another cluster/datacenter. This is also controlled per-bucket and so that data could be split to a separate bucket.
-When you have fairly extreme differences in workload (especially around the amount of writes) to a given dataset, it does begin to make sense from a view/index perspective to separate the data into a separate bucket. I mention this because it's true, but I also want to be clear that it is not the common case. You should use this approach after you identify a problem, not before because you think you might.
Regarding this last point, yes every write to a bucket will be picked up by the indexing engine but by using document types within the JSON, you can abort the processing for a given document very quickly and it really shouldn't have a detrimental impact to have lots of data coming in that doesn't apply to certain views. If you don't mind, I'm particularly curious at which parts of the documentation imply otherwise since that certainly wasn't our intention.
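The "abort early on document type" pattern can be illustrated as follows. Real Couchbase view map functions are written in JavaScript; this is only a Python stand-in for the same logic, and the `type` and `name` field names, the document IDs, and the helper functions are all hypothetical.

```python
def users_by_name(doc: dict, emit) -> None:
    """Stand-in for a view map function that only indexes 'user' documents."""
    if doc.get("type") != "user":
        return  # cheap early exit for every other document type
    emit(doc["name"], None)


def build_index(docs: dict, map_fn) -> list:
    """Run map_fn over every document and collect the emitted (key, doc_id) rows."""
    rows = []
    for doc_id, doc in docs.items():
        map_fn(doc, lambda key, value, _id=doc_id: rows.append((key, _id)))
    return sorted(rows)


bucket = {
    "u1": {"type": "user", "name": "bob"},
    "o1": {"type": "order", "total": 12},
    "u2": {"type": "user", "name": "alice"},
}

index = build_index(bucket, users_by_name)
print(index)  # [('alice', 'u2'), ('bob', 'u1')]
```

Every document is still passed to the function, but documents of other types cost a single field comparison and produce no index rows.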
So in general, we see most deployments with a low number of buckets (2-3) and only a few upwards of 5. Our limit of 10 comes from some known CPU and disk IO overhead of our internal tracking of statistics (the load or lack thereof on a bucket doesn't matter here). We certainly plan to reduce this overhead with future releases, but that still wouldn't change our recommendation of only having a few buckets. The advantages of being able to combine multiple "schemas" into a single logical grouping and apply view/indexes across that still exist regardless.
We are in the process right now of coming up with much more specific guidelines and sizing recommendations (I wrote those first two blogs as a stop-gap until we do).
As an initial approach, you want to try and keep the number of design documents around 4 because by default we process up to 4 in parallel. You can increase this number, but that should be matched by increased CPU and disk IO capacity. You'll then want to keep the number of views within each document relatively low, probably well below 10, since they are each processed in serial.
I recently worked with one user who had a fairly large number of views (around 8 design documents, some with nearly 20 views) and we were able to bring this down drastically by combining multiple views into one. Obviously it's very application dependent, but you should try to generate multiple different "queries" off of one index. Using reductions, key-prefixing (within the views), and collation, all combined with different range and grouping queries, can make a single index that may appear crowded at first, but is actually very flexible.
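The key-prefixing idea can be sketched with a sorted list standing in for a view index. This is only an illustration in Python: the compound key layout (document type, then date), the sample rows, and the `range_query` helper are assumptions, and Python's tuple ordering is used in place of Couchbase's own collation rules.

```python
import bisect

# One "view" whose rows carry compound keys of the form (doc type, date).
rows = sorted([
    (("order", "2013-01-05"), "o1"),
    (("order", "2013-02-11"), "o2"),
    (("user", "2012-12-30"), "u1"),
    (("user", "2013-01-20"), "u2"),
])
keys = [key for key, _ in rows]


def range_query(start_key: tuple, end_key: tuple) -> list:
    """Return doc ids whose key falls in the half-open range [start_key, end_key)."""
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_left(keys, end_key)
    return [doc_id for _, doc_id in rows[lo:hi]]


# Query 1: all orders, using only the key prefix.
all_orders = range_query(("order",), ("order", "\uffff"))
# Query 2: users created in January 2013, using prefix plus date range.
january_users = range_query(("user", "2013-01-01"), ("user", "2013-02-01"))

print(all_orders)     # ['o1', 'o2']
print(january_users)  # ['u2']
```

Two logically different queries are answered from one sorted index, which is the effect the answer describes when several views are merged into one.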
The fewer design documents and views you have, the less disk space, IO and CPU resources you will need. There's never going to be a magic bullet or hard-and-fast guideline number, unfortunately. In the end, YMMV and testing on your own dataset is better than any multi-page response I can write ;-)
Hope that helps, please don't hesitate to reach out to us directly if you have specific questions about your specific use case that you don't want published.
Perry
As you can see from the Couchbase documentation, it is not really possible to provide a "universal" rule that gives you an exact number.
But based on the best practice document that you have used and some discussion (here), you should be able to design your database/views properly.
Let's start with the last question:
Yes, the reason why Couchbase advises having a small number of buckets is performance and, more importantly, resource consumption. I invite you to read these blog posts, which help to understand what's going on "inside" Couchbase:
Sizing 1: http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster
Sizing 2: http://blog.couchbase.com/how-many-nodes-part-2-sizing-couchbase-server-20-cluster
Compaction: http://blog.couchbase.com/compaction-magic-couchbase-server-20
So you will see that most of the "operations" are done by bucket.
So let's now look at the original question:
Yes, most of the time you will organize the design documents and views by type of document.
It is NOT a problem to have all the document "types" in a single bucket (or a few buckets); this is in fact the way you work with Couchbase.
The most important things to look at are the size of your documents (to see how "long" parsing the JSON will take) and how often documents are created, updated, and deleted, since the JS code of the view is ONLY executed when you create or change a document.
So what you should do:
1 single bucket
how many design documents? (how many types do you have?)
how many views will you have in each design document?
In fact, the most expensive part is not indexing or querying; it is when you have to rebalance the data and indices between nodes (adding, removing, or failure of nodes).
Finally, though it looks like you already know it, this chapter is quite good for understanding how views work (how the index is created and used):
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views-operation.html
Do not hesitate to add more information if needed.

Resources