optimization on old indexes collecting logs from my apps - elasticsearch

I have an elastic cluster with 3x nodes(each 6x cpu, 31GB heap , 64GB RAM) collecting 25GB logs per day , but after 3x months I realized my dashboards become very slow when checking stats in past weeks , please, advice if there is an option to improve the indexes read erformance so it become faster when calculating my dashboard stats?
Thanks!

I would suggest you try to increase the shards number
when you have more shards Elasticsearch will split your data over the shards so as a result, Elastic will send multiple parallel requests to search in a smaller data stack
for Shards number you could try to split it based on your heap memory size
No matter what actual JVM heap size you have, the upper bound on the maximum shard count should be 20 shards per 1 GB of heap configured on the server.
ElasticSearch - Optimal number of Shards per node
https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index
https://opster.com/elasticsearch-glossary/elasticsearch-choose-number-of-shards/

It seems that the amount of data that you accumulated and use for your dashboard is causing performance problems.
A straightforward option is to increase your cluster's resources but then you're bound to hit the same problem again. So you should rather rethink your data retention policy.
Chances are that you are really only interested in most recent data. You need to answer the question what "recent" means in your use case and simply discard anything older than that.
Elasticsearch has tools to automate this, look into Index Lifecycle Management.
What you probably need is to create an index template and apply a lifecycle policy to it. Elasticsearch will then handle automatic rollover of indices, eviction of old data, even migration through data tiers in hot-warm-cold architecture if you really want very long retention periods.
All this will lead to a more predictable performance of your cluster.

Related

what value to be used for max_num_segments for Eslastic search 7.9 while doing forcemerge?

I am upgrading ES 1.4.2 to ES 7.9. For that I had to do remote indexing by re-indexing API to get all the data from old cluster to new cluster. But after re-indexning the search query performance of ES 7.9 got decreased a lot. So I am planning to do forcemerge in order to increase the query performance. Which value should I use for max_num_segments if I decide to to forcemerge? there are no guidelines provided.
The old cluster have 2 primary shards , one with 18 segments 16 and second with 18 segments. The new cluster after remote indexing has 2 primary shards one with 27 segments and second with 30 segments.
Pleased guide me on the value to be used for max_num_segments. Thanks
Although you have not provided any information about your cluster, index size, docs, replicas, heap size, your search query, etc so it's difficult to say what is the root cause of your bad search performance. but in general, having fewer segments improve the search performance and merge process happens automatically and in very rare-case it requires the manual intervention but still you can use the force merge API and it shouldn't cross the 5GB threshold.
Please refer to my this SO answer for more info on the merge process and API, try with max_num_segments=1 and ES will try to reduce it to the minimum number of segments(5 GB threshold might prevent some segments not eligible for merging).
Also, please use segment API to see the size of your segments and add it with other required information.

Elasticsearch Indexing Speed Degrade with the Time

I am doing some performance tuning in elastic search for my project and I need some help in improving the elastic search indexing speed. I am using ES 5.1.1 and I have 2 nodes setup with 8 shards for the index. I have the servers for 2 nodes with 16GB RAM and 12CPUs allocated for each server with 2.2GHz clock speed. I need to index around 25,000,000 documents within 1.5 hours, which I am currently doing in around 4 hours. I have done the following config changes to improve the indexing time.
Setting ‘indices.store.throttle.type’ to ‘none’
Setting ‘refresh_interval’ to ‘-1’
Increasing ‘translog.flush_threshold_size’ to 1GB
Setting ‘number_of_replicas’ to ‘0’
Using 8 shards for the index
Setting VM Options -Xms8g -Xmx8g (Half of the RAM size)
I am using the bulk processor to generate the documents in my java application and I’m using the following configurations to setup the bulk processor.
Bulk Actions Count : 10000
Bulk Size in MB : 100
Concurrent Requests : 100
Flush Interval : 30
Initially I can index around 356167 in the first minute. But with the time, It decreases and after around 1 hour its around 121280 docs per minute.
How can I keep the indexing rate steady over the time? Is there any other ways to improve the performance?
I highly encourage not to change configuration parameters like the translog flush size, the throttling, unless you know what you are doing (and this does not mean reading some blog post on the internet :-)
Try a single shard per server and especially reduce the bulk size to something like 10MB. 100MB * 100 concurrent requests means you need 10GB of heap to deal with those (without doing anything else). I suppose not all of the documents get indexed because of your rejected tasks in your threadpools.
Start small and get bigger instead of starting big but not have any insight in your indexing.

Is there a limit on the number of indexes that can be created on Elastic Search?

I'm using AWS-provided Elastic Search.
I have a signup page on my website, and on each signup; a new index for the new user gets created (to be used later by his work-group), which means that the number of indexes is continuously growing, (now it reached around 4~5k).
My question is: is there a performance limit on the number of indexes? is it safe (performance-wise) to keep creating new indexes dynamically with each new user?
Note: I haven't used AWS-Elasticsearch, so this answer may vary because they have started using open-distro of Elsticsearch and have forked the main branch. But a lot of principles should be the same. Also, this question doesn't have a definitive answer and it depends on various factors but I hope this answer will help the thought process.
One of the factors is the number of shards and replicas per index as that will contribute to the total number of shards per node. Each shard consumes some memory, so you will have to keep the number of shards limited per node so that they don't exceed maximum recommended 30GB heap space. As per this comment 600 to 1000 should be reasonable and you can scale your cluster according to that.
Also, you have to monitor the number of file descriptors and make sure that doesn't create any bottleneck for nodes to operate.
HTH!
If I'm not mistaken, the only limit is the disk space of your server, but if your index is growing too fast you should think about having more replica servers. I recomend reading this page: Indexing Performance Tips
Indexes themselves have no limit, however shards do, the recommended amount of shards per GB of heap is 20(JVM heap - you can check on kibana stack monitoring tab), this means if you have 5GB of JVM heap, the recommended amount is 100.
Remember that 1 index can take from 1 to x number of shards (1 primary and x secondary), normally people have 1 primary and 1 secondary, if this is you case then you would be able to create 50 indexes with those 5GB of heap

optimize elasticsearch / JVM

I have a website for classified. For this I'm using elasticsearch, postgres and rails on a same ubuntu 14.04 dedicated server, with 256GB of RAM and 20 cores, 40 threads.
I have 10 indexes on elasticsearch, each have default numbers of shards (5). They have between 1000 and 400 000 classifieds depending on which index.
approximately 5000 requests per minute, 2/3 making an elasticsearch request.
according to htop, jvm is using around 500% of CPU
I try different options, I reduce number of shards per index, I also try to change JAVA_OPTS as followed
#JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
#JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
#JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
#JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"
but it doesn't seems to change anything.
so to questions :
when you change any setting on elasticsearch, and then restart, should the improvement (if any) be visible immediately or can it arrive a bit later thanks to cache or anything else ?
can any one help me to find good configuration for JVM / elasticsearch so it will not take that many resources
First, it's a horrible idea to run your web server, database and Elasticsearch server all on the same box. Each of these should be given it's own box, at least. In the case of Elasticsearch, it's actually recommended to have at least 3 servers, or nodes. That way you end up with a load balanced cluster that won't run into split-brain issues.
Further, sharding only makes sense in a cluster. If you only have one node, then all the shards reside on the same node. This causes two performance problems. First, you get the hit that sharding always adds. For every query, Elasticsearch must query each shard individually (each is a separate Lucene index). Then, it must combine and process the result from all the shards to produce the final result. That's a not insignificant amount of overhead. Second, because all the shards reside on the same node, you're I/O-locked. The shards have to be queried one at a time instead of all at once. Optimally, you should have one shard per node, however, since you can't create more shards without reindexing, it's common to have a few extra hanging around for future horizontal scaling. In that scenario, the cost of reindexing what could be 100's of gigs of data or more outweighs a little bit of performance bottleneck. However, if you've got 5 shards running one node, that's probably a large part of your performance problems right there.
Finally, and again, with Elasticsearch in particular, swapping is a huge no-no. Most of what makes Elasticsearch efficient is it's cache which all resides in RAM. If swaps occur, it jacks with the cache in sometimes unpredictable ways. As result, it's recommended to turn off swapping completely on the box your node(s) run on, and set Elasticsearch/JVM to have a min and max memory consumption of roughly half the available RAM of the box. That's virtually impossible to achieve if you have other things running on it like a web server or database. Databases in particular aggressively consume RAM in order to increase throughput, which is why those should likewise reside on their own servers.

Index linear growth - Performance degradation

We have 4 shards with 14GB index on each of them
Each shard has a master and 3 slaves (each of them with 32GB RAM)
We're expecting that the index size will grow to double or triple in near future.
So we thought of merging our indexes to 28GB index so that each shard has 28GB index and also increased our RAM on each slave to 48GB.
We made this changes locally and tested the server by sending same 10K realistic queries to each server with 14GB & 28GB index, we found that
For server with 14GB index (48GB RAM): search time was 480ms, number of index hits: 3.8G
For server with 28GB index (48GB RAM): search time was 900ms, number of index hits: 7.2G
So we saw that having the whole index in RAM doesn't help in sustaining the performance in terms of search time. Search time increased linearly to double when the index size was doubled.
We were thinking of keeping only 4 shards configuration but it looks like now we have to add another shard or another slave to each shard.
Is there any other way that we can configure our servers so that the performance isn't affected even when index size doubles or triples?
I'd hate to say it depends, but it... depends.
The total size of your index on each is 14GB, which basically doesn't mean much of anything to SOLR. To get a real feel for performance what is the uniqueness of the terms indexed? An index of 14GB worth of data with the single word "cat" in it over and over again will be really quick.
Also have you confirmed you need the following features, disabling them can boost performance large amounts:
Schema
Stored Fields
Do you need stored fields? Removing this can greatly increase performance (you can safely have an entire index without any stored fields and rely completely on facets, pivots, and other features in solr to drive a UX).
omitNorms
You can, in some instances, set this flag to false to reduce memory in general and increase performance.
omitTermFreqAndPositions
Can be turned off, reduced memory in general and increase in performance.
System
Optimize Core/Index (Segment Count)
Index optimization is important when dealing with larger index sizes. Ensure each core is optimized and that when you look at the core it says the segment count is = 1. What I found is that this play a more important role as you increase the index size (this plays into OS level file caching and the fact it's easier to read one large file, rather than multiple small files) And yes, that does say 171 million+ documents.
Term Index Interval/Frequency
Configuration of term index interval may be required (by default 256) if you have a field or multiple fields that contain very unique values (for example GUID/UUIDs or unique IDs in general). Typically, the lower the TIF the more memory you need, the higher the TIF the less memory you need but the more disk seeks you may have.
Allocation of too much Ram
Solr works best with a good split between OS level disk cache and RAM used when faceting, you'd be surprised that you could actually get better performance by tweaking other parameters which lower required ram usage and free up resources for disk.

Resources