A very long delay between indexing a document in ElasticSearch and seeing it in search results - performance

We are experiencing a problem with ES 5.4.2 in production. About 25% of our indexed documents only appear in search results 3 seconds or more after the indexing request returns a success response; for some documents it is more than 7 seconds. Our cluster consists of three nodes with 59 GB of RAM, of which 31 GB is dedicated to the ES heap. Heap usage never goes over 75%, with GC dropping it back down to ~25-30%. The indexing rate is relatively low, less than 10 documents per second. Our average refresh time is ~300 ms, and the refresh interval is the default 1 s.
With these numbers I would expect at most a ~1.5 s delay between indexing a document and it becoming searchable. What am I missing here? Is it possible to reduce this delay? Maybe there is some metric worth monitoring that I am overlooking?
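For reference, here is a minimal sketch of the two knobs that control this delay, using the Python client (the index name, type, and documents are made up for illustration): an index request can be asked to wait for the next refresh before returning (refresh="wait_for", available since ES 5.0), or a refresh can be forced explicitly for the rare documents that must be searchable immediately.

    # Sketch only: illustrative index/doc names, official elasticsearch-py client.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # Option 1: the call returns only once the document is searchable
    # (it waits for the next scheduled refresh, ~1 s by default).
    es.index(index="my-index", doc_type="doc", id="42",
             body={"message": "hello"}, refresh="wait_for")

    # Option 2: force an immediate refresh. Cheap at <10 docs/sec,
    # but avoid doing this per document under heavy indexing load.
    es.index(index="my-index", doc_type="doc", id="43",
             body={"message": "world"})
    es.indices.refresh(index="my-index")

Neither option makes refreshes themselves faster; they only let the client know (or decide) when a document becomes visible to search.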

Related

Optimization of old indexes collecting logs from my apps

I have an Elastic cluster with 3 nodes (each with 6 CPUs, 31 GB heap, 64 GB RAM) collecting 25 GB of logs per day. After 3 months I realized my dashboards become very slow when checking stats over past weeks. Please advise whether there is an option to improve the indexes' read performance so that calculating my dashboard stats becomes faster.
Thanks!
I would suggest you try increasing the number of shards.
When you have more shards, Elasticsearch splits your data across them, so a search fans out as multiple parallel requests, each over a smaller slice of the data.
For the shard count, you could size it based on your heap memory.
Whatever your actual JVM heap size, keep the shard count below roughly 20 shards per 1 GB of heap configured on the node.
ElasticSearch - Optimal number of Shards per node
https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index
https://opster.com/elasticsearch-glossary/elasticsearch-choose-number-of-shards/
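As a hedged illustration of the suggestion above (the index name and shard count are placeholders; pick them based on your data volume and heap), note that the number of primary shards is fixed at index creation time, so adding shards means creating a new index and reindexing into it:

    # Sketch: create a new index with more primary shards.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    es.indices.create(index="logs-reindexed", body={
        "settings": {
            "number_of_shards": 6,      # illustrative value, not a recommendation
            "number_of_replicas": 1
        }
    })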
It seems that the amount of data that you accumulated and use for your dashboard is causing performance problems.
A straightforward option is to increase your cluster's resources, but then you're bound to hit the same problem again, so you should instead rethink your data retention policy.
Chances are you are really only interested in the most recent data. You need to decide what "recent" means in your use case and simply discard anything older than that.
Elasticsearch has tools to automate this; look into Index Lifecycle Management.
What you probably need is to create an index template and apply a lifecycle policy to it. Elasticsearch will then handle automatic rollover of indices, eviction of old data, even migration through data tiers in hot-warm-cold architecture if you really want very long retention periods.
All this will lead to a more predictable performance of your cluster.
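A minimal sketch of what such a setup could look like (policy, template, and alias names and the 30-day retention are placeholders; ILM requires Elasticsearch 6.6+, and the method names assume a recent elasticsearch-py client):

    # Sketch: rollover-based lifecycle policy plus a matching index template.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # Roll over daily or at 50 GB; delete indices 30 days after rollover.
    es.ilm.put_lifecycle(policy="logs-policy", body={
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_size": "50gb", "max_age": "1d"}}},
                "delete": {"min_age": "30d", "actions": {"delete": {}}}
            }
        }
    })

    # Template applying the policy to every new logs-* index.
    # The first index (e.g. logs-000001) still has to be created with the
    # "logs" write alias before rollover can kick in.
    es.indices.put_template(name="logs-template", body={
        "index_patterns": ["logs-*"],
        "settings": {
            "index.lifecycle.name": "logs-policy",
            "index.lifecycle.rollover_alias": "logs"
        }
    })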

Increase bulk index request performance of Elasticsearch

tl;dr: Currently I am using Elasticsearch 5.6 as a data store for my Go application. I have a million documents spread across 12 indices. Each index has 5 shards and 2 replicas. First I load those docs from Elasticsearch, then process them, and then index them in bulk at a rate of 10,000 docs per batch. I run 3 workers, each with one async goroutine at a time. Those goroutines work per index, so each sends bulk index requests for a single index. That means a worker sends around 100,000 docs per goroutine. The docs are sent in batches, and each goroutine sends almost 10 batches. The whole process takes more than a minute, and most of that time is spent in bulk indexing.
My current Elasticsearch node is running with 6 GB RAM and a 3.5 GB heap. I tried to tune Elasticsearch to improve indexing speed by increasing the index buffer size to 20%, i.e. 700 MB. I disabled indexing for fields that don't need it, optimised the numeric field types in the mappings, disabled the _all field, and changed the index codec (compression method) to best_compression. After doing all this, there is not much improvement.
So I would like to get ideas for improving bulk indexing performance so that the whole process finishes within a minute. Will it improve if I add more RAM and a bigger heap to Elasticsearch? Are there any other settings to tune?
Right now I am in the development phase, so I can also switch to another data store that suits my requirements of reading, writing and analyzing data quickly. Such ideas are welcome as well.
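For context, a hedged sketch of what the 5.x-era tuning described in the question roughly corresponds to (index, type, and field names are invented; the index buffer size is a node-level setting in elasticsearch.yml, not an index setting):

    # Sketch: best_compression codec, _all disabled, a stored-but-not-indexed
    # field, and a narrow numeric type, as described in the question.
    # Node-level, in elasticsearch.yml:  indices.memory.index_buffer_size: 20%
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    es.indices.create(index="my-index", body={
        "settings": {
            "index.codec": "best_compression",   # smaller on disk, a bit more CPU
            "number_of_shards": 5,
            "number_of_replicas": 2
        },
        "mappings": {
            "doc": {                              # 5.x mapping type
                "_all": {"enabled": False},       # skip building the _all field
                "properties": {
                    "payload": {"type": "keyword", "index": False},  # not searchable
                    "count": {"type": "integer"}
                }
            }
        }
    })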

Improving Response Time and Reducing CPU Utilization while using Elasticsearch Percolation

I'm looking for ways to cut down the response time of Elasticsearch percolation and to reduce the CPU utilization while it is being performed.
I tried a bunch of steps and managed to bring down the response time, but it impacted CPU utilization. I'm using Elasticsearch 5.6 and I'm checking whether I can get a response time of less than 2 seconds, at the least.
The steps are mentioned below:
Tried running the percolator query with 1 node and 1 shard. The response time was very poor, varying between 37 and 40 seconds.
Tried running the percolator query with 1 node and 3 shards. The response time was better, but still not great, varying between 14 and 16 seconds. This was a scenario where I attempted over-allocation of shards to see if that made a difference; although the response time improved, CPU utilization went over 90% on a 4-core, 32 GB VM. There was a memory spike, but nothing alarming. I think memory would become a concern if consecutive percolator queries were attempted.
Tried running the percolator query with 1 node and 10 shards. The response time was better, but still not great, varying between 13 and 15 seconds.
Checked out some links in the Elasticsearch GitHub discussions and tried reducing the terms, but that started affecting the scoring, so I had to abandon this step since scoring and matching must stay consistent for the use case I'm working on.
The links I referred to are listed below.
How to improve percolator performance in ElasticSearch?
https://github.com/elastic/elasticsearch/issues/26307
https://github.com/elastic/elasticsearch/issues/25445
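For reference, the kind of request being benchmarked, sketched with the Python client (index, field, and document contents are placeholders; in 5.x the stored queries live in a field mapped as type "percolator"):

    # Sketch: ES 5.x percolation - match a document against stored queries.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    resp = es.search(index="my-percolator-index", body={
        "query": {
            "percolate": {
                "field": "query",          # the field mapped as type "percolator"
                "document_type": "doc",    # required in 5.x, removed in later versions
                "document": {"message": "a sample document to match"}
            }
        }
    })
    print(resp["hits"]["total"])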

Elasticsearch Indexing Speed Degrades over Time

I am doing some performance tuning of Elasticsearch for my project and I need some help improving the indexing speed. I am using ES 5.1.1 and I have a 2-node setup with 8 shards for the index. Each node's server has 16 GB RAM and 12 CPUs allocated, with a 2.2 GHz clock speed. I need to index around 25,000,000 documents within 1.5 hours, which currently takes around 4 hours. I have made the following config changes to improve the indexing time (see the sketch after the list):
Setting ‘indices.store.throttle.type’ to ‘none’
Setting ‘refresh_interval’ to ‘-1’
Increasing ‘translog.flush_threshold_size’ to 1GB
Setting ‘number_of_replicas’ to ‘0’
Using 8 shards for the index
Setting VM Options -Xms8g -Xmx8g (Half of the RAM size)
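For completeness, a sketch of how the refresh and replica settings from the list can be switched off for the load and restored afterwards (the index name and the restored values are placeholders):

    # Sketch: disable refresh and replicas for the bulk load, then restore.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])
    INDEX = "my-index"

    es.indices.put_settings(index=INDEX, body={
        "index": {"refresh_interval": "-1", "number_of_replicas": 0}
    })

    # ... run the bulk load here ...

    es.indices.put_settings(index=INDEX, body={
        "index": {"refresh_interval": "1s", "number_of_replicas": 1}  # placeholder values
    })
    es.indices.refresh(index=INDEX)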
I am using the bulk processor to generate the documents in my Java application, and I'm using the following configuration to set up the bulk processor:
Bulk Actions Count : 10000
Bulk Size in MB : 100
Concurrent Requests : 100
Flush Interval : 30
Initially I can index around 356,167 documents in the first minute, but over time this decreases; after around an hour it is down to about 121,280 docs per minute.
How can I keep the indexing rate steady over time? Are there any other ways to improve the performance?
I strongly encourage you not to change configuration parameters like the translog flush size or the store throttling unless you know what you are doing (and that does not mean having read some blog post on the internet :-)).
Try a single shard per server and, especially, reduce the bulk size to something like 10 MB. 100 MB times 100 concurrent requests means you need 10 GB of heap just to hold those requests (before doing anything else). I suspect that not all of the documents are getting indexed because of rejected tasks in your thread pools.
Start small and scale up, instead of starting big and having no insight into your indexing.
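The question uses the Java bulk processor, but just to illustrate the sizing advice, here is a hedged sketch of the same idea with the Python bulk helper (index, type, and documents are placeholders; the helper caps each request at roughly 10 MB instead of the default 100 MB):

    # Sketch: smaller bulk requests, chunking left to the client helper.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(["http://localhost:9200"])

    def actions(docs):
        for doc in docs:
            yield {"_index": "my-index", "_type": "doc", "_source": doc}

    docs = ({"field": i} for i in range(100000))        # placeholder documents
    ok, errors = helpers.bulk(es, actions(docs),
                              chunk_size=1000,                   # docs per request
                              max_chunk_bytes=10 * 1024 * 1024)  # ~10 MB per request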

Performance issues when pushing data at a constant rate to Elasticsearch on multiple indexes at the same time

We are experiencing some performance issues, or anomalies, with Elasticsearch on a system we are currently building.
The requirements:
We need to capture data for multiple customers, who will query and report on it on a near-real-time basis. All the documents received have the same format with the same properties and are in a flat structure (all fields are of a primitive type; there are no nested objects). We want to keep each customer's information separate from the others.
Frequency of data received and queried:
We receive data for each customer at a fluctuating rate of 200 to 700 documents per second, with the peak in the middle of the day.
Queries will mostly be aggregations over around 12 million documents per customer: histograms/percentiles to show patterns over time, plus the occasional raw document retrieval to find out what happened at a particular point in time. We are aiming to serve 50 to 100 customers at varying insert rates; the smallest could be 20 docs/sec while the largest peaks at 1,000 docs/sec for some minutes.
How are we storing the data:
Each customer has one index per day. For example, if we have 5 customers, there will be a total of 35 indexes for the whole week. The reason we break it out per day is that it is mostly the latest two days that get queried, with the remaining days queried only occasionally. We also do it this way so we can delete older indexes independently per customer (some may want to keep 7 days, others 14 days' worth of data).
How we are inserting:
We are sending data in batches of 10 to 2,000 documents every second. One document is around 900 bytes raw.
Environment
AWS c3.large – 3 nodes
All indexes are created with 10 shards and 2 replicas for test purposes
Both Elasticsearch 1.3.2 and 1.4.1
What we have noticed:
If I push data to one index only, the response time starts at 80 to 100 ms per batch when the insert rate is around 100 documents per second. As I ramp it up I can reach about 1,600 before the time per batch approaches 1 second; when I increase it to close to 1,700, it hits a wall at some point because of concurrent insertions, and the time spirals to 4 or 5 seconds. That said, if I reduce the rate of inserts, Elasticsearch recovers nicely. CPU usage increases as the rate increases.
If I push to 2 indexes concurrently, I can reach a total of 1,100, and CPU goes up to 93% at around 900 documents per second.
If I push to 3 indexes concurrently, I can reach a total of 150 and CPU goes up to 95 to 97%. I tried it many times. The interesting thing is that the response time is around 109 ms at that point. I can increase the load to 900 and the response time will still be around 400 to 600 ms, but CPU stays up.
Question:
Looking at our requirements and the findings above, is this design suitable for what is being asked? Are there any tests I can run to find out more? Are there any settings I need to check (and change)?
I've been hosting thousands of Elasticsearch clusters on AWS over at https://bonsai.io for the last few years, and have had many a capacity planning conversation that sounds like this.
First off, it sounds to me like you have a pretty good cluster design and test rig going here. My first intuition here is that you are legitimately approaching the limits of your c3.large instances, and will want to bump up to a c3.xlarge (or bigger) fairly soon.
An index per tenant per day could be reasonable if you have relatively few tenants. You may consider an index per day for all tenants, using filters to focus your searches on specific tenants. And unless there are obvious cost savings from discarding old data, filters should suffice to enforce data retention windows as well.
The primary benefit of segmenting your indices per tenant would be to move your tenants between different Elasticsearch clusters. This could help if you have some tenants with wildly larger usage than others. Or to reduce the potential for Elasticsearch's cluster state management to be a single point of failure for all tenants.
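If you do go with shared daily indices, tenant isolation at query time is just a filter. A sketch (index pattern, field names, and the aggregation are illustrative; the bool/filter form shown is the 2.x+ syntax, on 1.x you would wrap the filter in a filtered query instead):

    # Sketch: aggregate over one tenant's documents in shared daily indices.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    resp = es.search(index="events-2015.01.*", body={
        "size": 0,
        "query": {
            "bool": {"filter": [{"term": {"tenant_id": "customer-42"}}]}
        },
        "aggs": {
            "per_hour": {"date_histogram": {"field": "@timestamp", "interval": "1h"}}
        }
    })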
Here are a few other things to keep in mind that may help explain the performance variance you're seeing.
Most importantly here, indexing is incredibly CPU bottlenecked. This makes sense, because Elasticsearch and Lucene are fundamentally just really fancy string parsers, and you're sending piles of strings. (Piles are a legitimate unit of measurement here, right?) Your primary bottleneck is going to be the number and speed of your CPU cores.
In order to take the best advantage of your CPU resources while indexing, you should consider the number of primary shards you're using. I'd recommend starting with three primary shards to distribute the CPU load evenly across the three nodes in your cluster.
For production, you'll almost certainly end up on larger servers. The goal is for your total CPU load at peak indexing to stay under 50%, so you have headroom left for processing your searches; aggregations are also fairly CPU hungry. The extra headroom also helps you handle any unforeseen circumstances gracefully.
You mention pushing to multiple indices concurrently. I would avoid concurrency when bulk updating into Elasticsearch, in favor of batch updating with the Bulk API. You can bulk load documents for multiple indices with the cluster-level /_bulk endpoint. Let Elasticsearch manage the concurrency internally without adding to the overhead of parsing more HTTP connections.
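As a sketch of that last point (index, type, and field names are invented), a single bulk request can carry documents for several indices, because each action names its own target index:

    # Sketch: one bulk request targeting several daily per-tenant indices.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(["http://localhost:9200"])

    actions = [
        {"_index": "customer-a-2015.01.07", "_type": "event", "_source": {"value": 1}},
        {"_index": "customer-b-2015.01.07", "_type": "event", "_source": {"value": 2}},
    ]
    helpers.bulk(es, actions)  # goes out as one request to the cluster-level /_bulk endpoint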
That's just a quick introduction to the subject of performance benchmarking. The Elasticsearch docs have a good article on Hardware which may also help you plan your cluster size.
