How to determine what causes ES's query API instability - elasticsearch

Normally, my ES query API takes less than 1s.But sometimes these queries get slow.
cluster consists of three 32G machines (16G allocated to ES).The index consists of 20 primaries and 1 replica, 303,000,000 dos count and 500gb primaries storage size and 1tb storage size.
Here's kibana's monitoring data:
`
Personally, I think it's the result of GC. I want to add machines.But I need to find a reason to convince my leader.

Yes it could be a GC problem. But can you be more specific? What do you mean by slow?
Anyway it seems the allocated heap is way too large for your needs. You have a collection when the heap is at 12Go ( 75% of 16go ) and it goes back to 5go every time. Its generate huge garbage collection.
You should try to lower the heap to like 10Go and check the impact on performance GC count and GC duration.
I recommands you too read this article https://www.elastic.co/blog/a-heap-of-trouble especially the "Together We Can Prevent Forest Fires" part.

Related

how to decide the memory requirement for my elasticsearch server

I have a scenario here,
The Elasticsearch DB with about 1.4 TB of data having,
_shards": {
"total": 202,
"successful": 101,
"failed": 0
}
Each index size is approximately between, 3 GB to 30 GB and in near future, it is expected to have 30GB file size on a daily basis.
OS information:
NAME="Red Hat Enterprise Linux Server"
VERSION="7.2 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.2"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.2 (Maipo)"
The system has 32 GB of RAM and the filesystem is 2TB (1.4TB Utilised). I have configured a maximum of 15 GB for Elasticsearch server.
But this is not enough for me to query this DB. The server hangs for a single query hit on server.
I will be including 1TB on the filesystem in this server so that the total available filesystem size will be 3TB.
also I am planning to increase the memory to 128GB which is an approximate estimation.
Could someone help me calculate how to determine the minimum RAM required for a server to respond at least 50 requests simultaneously?
It would be greatly appreciated if you can suggest any tool/ formula to analyze this requirement. also it will be helpful if you can give me any other scenario with numbers so that I can use that to determine my resource need.
You will need to scale using several nodes to stay efficient.
Elasticsearch has its per-node memory sweet spot at 64GB with 32GB reserved for ES.
https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_memory for more details. The book is a very good read if you are using Elasticsearch for serious stuff
If you're here for a rule of thumb, I'd say that on modern ES and Java, 10-20GB of heap per TB of data (I'm thinking of the typical ELK use-case) should be enough. Multiplying by 2, that's 20-40GB of total RAM per TB.
Now for the datailed answer :) There are two types of memory that are relevant here:
JVM heap
OS cache (the OS will use free memory to cache index files)
OS cache is down to your IO requirements (queries do lots of small random IO). If you have a query-intensive use-case (e.g. E-commerce), you'll want to fit your whole index in the OS cache (or at least most of it). For logs and other time-series data, you typically have more expensive, rarer queries. There, if you have a local SSD you can make do with only a fraction of your data in the cache. I've seen servers with 4TB of disk space on 32GB of OS cache.
JVM heap can also be divided in two:
static memory, required even when the server is idle
transient memory, required by ongoing indexing/search operations
You'd see most of the static memory if you hit the _nodes/stats endpoint. It's best if you have these plotted in your Elasticsearch monitoring tool. You'll see it as segments_memory and various caches. For recent versions of Elasticsearch (e.g. 7.7 or higher), there's not a lot of memory like this - at least for most use-cases. I've seen ELK deployments with multiple TB of data definitely using less than 10GB of RAM for static memory. That said, you may reduce it by not storing info that you don't need. For example by not indexing fields you don't search on.
Transient memory will mainly depend on your queries: how often they run and how expensive they are. One-off expensive queries tend to be more dangerous, so avoid using too many levels of aggregations, massive size values, or queries that expand to too many terms (wildcards, fuzzy...). To accommodate those, you simply need heap. How much? It's really a matter of monitor-and-adjust.
Side-note: I don't agree with the general suggestion that you should stay under 32GB at all costs. With Java 11+ and G1GC, I've seen deployments with over 100GB of heap that run just fine. The overhead of uncompressed oops is not 10-20GB at every 30GB, like the docs suggest - that's an extrapolation of a worse-case scenario. In my experience, it's more like a few GB every 30GB - something like 10% for many deployments. This doesn't mean you have to use 100GB of heap, it's just that if you need a lot of heap in your cluster, you don't have to have hundreds of nodes (you can have fewer bigger ones).
Speaking of GC, it may fall behind if you run many queries that aren't terribly expensive. And then you'd run out of heap, even if you have plenty. Monitoring should tell you this, as a full GC will eventually clean up the heap with a big pause (read: cluster instability). Here, Java 11 with G1GC and a low -XX:GCTimeRatio (e.g. 3) should fix the issue.
This gives a good overview of heap sizing and memory management and you will be able to answer yourself.
https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
https://www.elastic.co/guide/en/elasticsearch/guide/master/_limiting_memory_usage.html

Understanding elasticsearch jvm heap usage

Folks,
I am trying reduce my memory usage in my elasticsearch deployment (Single node cluster).
I can see 3GB JVM heap space being used.
To optimize I first need to understand the bottleneck.
I have limited understanding of how is JVM usage is split.
Field data looks to consume 1.5GB and filter cache & query cache combined consume less than 0.5GB, that adds upto 2GB at the max.
Can someone help me understand where does elasticsearch eats up rest of 1GB?
I can't tell for your exact setup, but in order to know what's going on in your heap, you can use the jvisualvm tool (bundled with the jdk) together with marvel or the bigdesk plugin (my preference) and the _cat APIs to analyze what's going on.
As you've rightly noticed, the heap hosts three main caches, namely:
the fielddata cache: unbounded by default, but can be controlled with indices.fielddata.cache.size (in your case it seems to be around 50% of the heap, probably due to the fielddata circuit breaker)
the node query/filter cache: 10% of the heap
the shard request cache: 1% of the heap but disabled by default
There is nice mindmap available here (Kudos to Igor Kupczyński) that summarizes the roles of caches. That leaves more or less ~30% of the heap (1GB in your case) for all other object instances that ES needs to create in order to function properly (see more about this later).
Here is how I proceeded on my local env. First, I started my node fresh (with Xmx1g) and waited for green status. Then I started jvisualvm and hooked it onto my elasticsearch process. I took a heap dump from the Sampler tab so I can compare it later on with another dump. My heap looks like this initially (only 1/3 of max heap allocated so far):
I also checked that my field data and filter caches were empty:
Just to make sure, I also ran /_cat/fielddata and as you can see there's no heap used by field data yet since the node just started.
$ curl 'localhost:9200/_cat/fielddata?bytes=b&v'
id host ip node total
TMVa3S2oTUWOElsBrgFhuw iMac.local 192.168.1.100 Tumbler 0
This is the initial situation. Now, we need to warm this all up a bit, so I started my back- and front-end apps to put some pressure on the local ES node.
After a while, my heap looks like this, so its size has more or less increased by 300 MB (139MB -> 452MB, not much but I ran this experiment on a small dataset)
My caches have also grown a bit to a few megabytes:
$ curl 'localhost:9200/_cat/fielddata?bytes=b&v'
id host ip node total
TMVa3S2oTUWOElsBrgFhuw iMac.local 192.168.1.100 Tumbler 9066424
At this point I took another heap dump to gain insights into how the heap had evolved, I computed the retained size of the objects and I compared it with the first dump I took just after starting the node. The comparison looks like this:
Among the objects that increased in retained size, he usual suspects are maps, of course, and any cache-related entities. But we can also find the following classes:
NIOFSDirectory that are used to read Lucene segment files on the filesystem
A lot of interned strings in the form of char arrays or byte arrays
Doc values related classes
Bit sets
etc
As you can see, the heap hosts the three main caches, but it is also the place where reside all other Java objects that the Elasticsearch process needs and that are not necessarily cache-related.
So if you want to control your heap usage, you obviously have no control over the internal objects that ES needs to function properly, but you can definitely influence the sizing of your caches. If you follow the links in the first bullet list, you'll get a precise idea of what settings you can tune.
Also tuning caches might not be the only option, maybe you need to rewrite some of your queries to be more memory-friendly or change your analyzers or some fields types in your mapping, etc. Hard to tell in your case, without more information, but this should give you some leads.
Go ahead and launch jvisualvm the same way I did here and learn how your heap is growing while your app (searching+indexing) is hitting ES and you should quickly gain some insights into what's going on in there.
Marvel only plots some instances on the heap which needs to be monitored like Caches in this case.
The caches represent only a portion of the total heap usage. There are a lot many other instances which will occupy the heap memory and those may not have a direct plotting on this marvel interface.
Hence, Not all heap occupied in ES is only by the cache.
In order to clearly understand the exact usage of heap by different instances, you should take heap dump of the process and then analyze it using a Memory Analyzer tool which can provide you with the exact picture.

MongoDB insert performance with 2nd index

I'm trying to insert about 250 million documents that are each roughly 400 bytes into MongoDB 3.0 with WiredTiger. I need to search on only one short string key, _user_lower. Although I'm using WiredTiger now, which is much better than MMAPv1, I did use MMAPv1 first and had similar issues.
My server (a very cheap VPS) has:
250 GB magnetic disk
1 GB RAM
2 GB Swap
2.1 GHz single-core CPU
I know that this machine is really slow, and I'm asking it to do something a bit unrealistic. But I'm confused about how it started so fast with one index, and the second just ruined the performance:
I inserted all the data that I had at the time (about 250M rows) without any index except on _id. This performed very well, considering my awful hardware:
Approximately 5000 inserts per second (totally acceptable)
This rate was nearly constant for the 14 hours hours it took to complete
The index size on _id once complete was nearly 2.5GB. Note that this is more than double my physical RAM.
The RES of the process didn't exceed 450 MB according to mongostat.
No swapping
top seemed to indicate that CPU time wasn't all being spent waiting for the disk (so a significant amount was spent in userspace, presumably with WiredTiger in the snappy code)
Then I built a (non-unique) index on the only field I need to query by, _user_lower. This took 7.7 hours, which is fine since that's a one-time deal. The index ended up being 1.6 GB, which seems really low to me when compared to the _id index. The RES went up to about 750 MB.
Then, I downloaded a new data set to load. It was only 102 MB (238 K documents). I loaded it in the same way, using mongoimport, but this time:
Only 80 inserts per second (slower at times)
RES stayed at around 750 MB
top says almost 100% of the CPU was spent waiting for IO
Of course, load went through the roof.
I could understand a sizable performance hit, since that index has to be updated. But I didn't expect this much. I've read all over the place that my indexes should fit in RAM, but the performance was great during the initial insert, where the index quickly outgrew my memory.
Can I optimize the _user_index index at all? I don't know what this would even mean, but maybe only index the first few characters? I'm definitely willing to halve the query performance in exchange for tripling the insert performance.
What accounts for the massive performance hit? How do I fix it without new hardware? I'm not really attached to MongoDB, so alternatives that don't have these performance characteristics are fine. I have an idea that just uses flat files which would probably work but I don't want to write all that code.
When adding new items to a collection, the database will have to keep the index up-to-date. Since the index in MongoDB is a B-Tree by default, that means it will have to insert an item in the tree. While that isn't a particularly expensive operation in the best case, it comes with two potential performance problems:
performance jitter: from time to time, the B-Tree bucket might be full, requiring a bucket split and hence a lot more operations than the 'simple' insert
the insert destination must be readily available
In this case, the latter is likely to cause trouble: because the insertion of a name hits a random node in the tree (i.e, the name insertion doesn't follow a pattern) and your RAM is smaller than the index, chances are high that the destination must be fetched from disk. Unfortunately, the performance of disk seeks is orders of magnitude lower than main memory references. If you're unlucky, the first ref location requires another disk seek such that for a single insert multiple disk reads are required before MongoDB can even begin writing. That can take hundreds of milliseconds, with spinning disks or some contention on typical IaaS infrastructure even seconds.
Because ObjectIds are generated monotonically (the timestamp is the most significant part), the insertion always happens at the end and it is possible to keep the destination largely in RAM. Performance jitter, i.e. problem 1 might still be an issue since a bucket split might require a disk seek, but it happens so rarely compared to the first case that it doesn't wreck average performance, which should explain the observed behavior.
Also, when the bucket is filled by a monotonically increasing value, MongoDB will split the bucket when it is 90% filled; with random insertion, splits will happen a lot earlier, at 50%, so the tree is a little more 'dense' in that case.

Elastic Search - Maximum Shard Size

I came across and couldn't reach a final conclusion during learning ElasticSearch.
What is the maximum shard size for ElasticSearch?
How many shards can an index have? Is there any maximum limit?
After reading multiple articles and blogs and running my own load tests, I came to the conclusion that
number of shards and maximum size of each shard depends upon many factors like:
Size of the data inserted
Rate at which the data is inserted
Whether data retrieval / search is happening at the same time? If yes, what is the frequency of search? How many concurrent searches are done?
Server configuration details, like number of cores in CPU, hard disk size, Memory size etc
So, to find out the optimized size for each shard and optimized number of shards for a deployment, one good way is to run tests using various combinations of parameters & loads and arrive at a conclusion.
Simple : Don’t Cross 4 Billions documents
Think about the limit of 32 bits systems of the Heap Size (still valid for 64 bits systems). ES recommand half memory up to 32 GB even for 64 bits systems, as it's concern memory handeling limit and optimization. If you have more than 64 GB of memory, you can keep further memory for Lucene?
For further details : https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html and https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index .
As others have said, the theoretical maximum is very large, however depending on your system, there can be practical limits.
I've found that shards start to become less performant around 150GB. I've had 50GB shards that perform reasonably well. In both cases, the shard was the only shard on the node, and the node had 54GB of system memory, with 31GB devoted to elasticsearch. At 50GB, I was getting results from relatively heavy-duty queries around 100ms, and at 150GB it was taking 500ms or longer.
I'm sure this depends on the mappings I've used, and a host of other factors, but perhaps it's useful if you're polling for datapoints.

Index linear growth - Performance degradation

We have 4 shards with 14GB index on each of them
Each shard has a master and 3 slaves (each of them with 32GB RAM)
We're expecting that the index size will grow to double or triple in near future.
So we thought of merging our indexes to 28GB index so that each shard has 28GB index and also increased our RAM on each slave to 48GB.
We made this changes locally and tested the server by sending same 10K realistic queries to each server with 14GB & 28GB index, we found that
For server with 14GB index (48GB RAM): search time was 480ms, number of index hits: 3.8G
For server with 28GB index (48GB RAM): search time was 900ms, number of index hits: 7.2G
So we saw that having the whole index in RAM doesn't help in sustaining the performance in terms of search time. Search time increased linearly to double when the index size was doubled.
We were thinking of keeping only 4 shards configuration but it looks like now we have to add another shard or another slave to each shard.
Is there any other way that we can configure our servers so that the performance isn't affected even when index size doubles or triples?
I'd hate to say it depends, but it... depends.
The total size of your index on each is 14GB, which basically doesn't mean much of anything to SOLR. To get a real feel for performance what is the uniqueness of the terms indexed? An index of 14GB worth of data with the single word "cat" in it over and over again will be really quick.
Also have you confirmed you need the following features, disabling them can boost performance large amounts:
Schema
Stored Fields
Do you need stored fields? Removing this can greatly increase performance (you can safely have an entire index without any stored fields and rely completely on facets, pivots, and other features in solr to drive a UX).
omitNorms
You can, in some instances, set this flag to false to reduce memory in general and increase performance.
omitTermFreqAndPositions
Can be turned off, reduced memory in general and increase in performance.
System
Optimize Core/Index (Segment Count)
Index optimization is important when dealing with larger index sizes. Ensure each core is optimized and that when you look at the core it says the segment count is = 1. What I found is that this play a more important role as you increase the index size (this plays into OS level file caching and the fact it's easier to read one large file, rather than multiple small files) And yes, that does say 171 million+ documents.
Term Index Interval/Frequency
Configuration of term index interval may be required (by default 256) if you have a field or multiple fields that contain very unique values (for example GUID/UUIDs or unique IDs in general). Typically, the lower the TIF the more memory you need, the higher the TIF the less memory you need but the more disk seeks you may have.
Allocation of too much Ram
Solr works best with a good split between OS level disk cache and RAM used when faceting, you'd be surprised that you could actually get better performance by tweaking other parameters which lower required ram usage and free up resources for disk.

Resources