Elasticsearch config tweaking with limited memory

I have following scenario:
A single machine with 32 GB of RAM runs Elasticsearch 2.4. There is one index with 5 shards that is 25 GB in size.
We are constantly indexing new data into that index while also running full-text search queries that touch about 95% of the documents (no aggregations). The instance generates a lot of CPU load; there is no swapping.
My question is: how should I tune Elasticsearch memory usage? (I don't have the option to add another machine at the moment.)
Should I assign more memory to the ES heap, e.g. 25 GB (going over the 50% of RAM that the docs advise not to exceed), or should I assign a minimal heap of 1-2 GB and assume Lucene will cache the whole index in memory, since these are full-text searches?

Right now, 50% of server memory (16 GB in this case) seems to work best for us.
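For ES 2.x the usual way to apply that split is to set the heap via the ES_HEAP_SIZE environment variable and lock it so it can never be swapped out; a minimal sketch, assuming ES is started from bin/elasticsearch (package installs put the variable in their defaults file instead):

    # heap at 50% of the 32 GB box; the rest is left to the OS filesystem cache
    export ES_HEAP_SIZE=16g

    # elasticsearch.yml: pin the heap in RAM so the OS never swaps it out
    bootstrap.mlockall: true

With mlockall enabled, make sure the user running ES is allowed to lock that much memory (ulimit -l unlimited), otherwise the lock fails with a warning in the logs.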

Related

Could decreasing the Elasticsearch heap size help improve search performance?

Since 01-13, search performance has slowed down; maybe some metric has reached a critical value, e.g. the index doc count or store size.
From the official documentation, I found:
Elasticsearch heavily relies on the filesystem cache in order to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.
Right now the maximum used heap is 11.72 GB, while the Elasticsearch process is configured with 16 GB (-Xms16g -Xmx16g).
So if I change the Elasticsearch heap size to 12 GB (-Xms12g -Xmx12g), will the filesystem cache be able to use more memory, and will search performance improve?
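How that 12 GB would be applied depends on the version: 2.x and earlier read the ES_HEAP_SIZE environment variable, while 5.x and later read config/jvm.options. A sketch for the jvm.options case, using the value proposed above:

    # config/jvm.options -- keep min and max equal so the heap is never resized at runtime
    -Xms12g
    -Xmx12g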

Elasticsearch: What if size of index is larger than available RAM?

Assume a single-machine system with an in-memory indexing scheme.
I am not able to find this info in the ES docs. Does ES start swapping out the overflowing data, loading it when needed and continuing to work, or does it give an error?
In-memory indices provide better performance at the cost of limiting the index size to the amount of available physical memory.
Via the 1.7 documentation. Memory stores are no longer available in 2.0+.
Under the hood it uses the Lucene RAMDirectory, which will just consume RAM (and eventually swap) until either you hit Java heap limits and ES crashes with out-of-memory errors, or the system gives up and oomkills the Elasticsearch process. Don't use in-memory indexes for large indexes, or for any situation where persistence is important.
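For completeness, this is roughly how such an index was declared on 1.x before the store type was removed in 2.0; the index name below is just a placeholder:

    # ES 1.x only: back the index with Lucene's RAMDirectory
    curl -XPUT 'localhost:9200/my_index' -d '{
      "settings": { "index.store.type": "memory" }
    }'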

How does ElasticSearch and Lucene share the memory

I have a question about the following quote from the official ES docs:
But if you give all available memory to Elasticsearch’s heap,
there won’t be any left over for Lucene.
This can seriously impact the performance of full-text search.
If my server has 80 GB of memory and I issue the following command to start an ES node: bin/elasticsearch -xmx 30g
that means I only give the ES process a maximum of 30 GB of memory. How can Lucene use the remaining 50 GB, given that Lucene runs inside the ES process and is just part of it?
The Xmx parameter simply indicates how much heap you allocate to the ES Java process. But allocating RAM to the heap is not the only way to use the available memory on a server.
Lucene does indeed run inside the ES process, but Lucene doesn't only make use of the allocated heap, it also uses memory by heavily leveraging the file system cache for managing index segment files.
There were two great blog posts (this one and this other one) from Lucene's main committer which explain in greater detail how Lucene leverages all the available remaining memory.
The bottom line is to allocate 30GB heap to the ES process (using -Xmx30g) and then Lucene will happily consume whatever is left to do what needs to be done.
Lucene uses off-heap memory via the OS. This is described in the Elasticsearch guide in the section about heap sizing and swapping.
Lucene is designed to leverage the underlying OS for caching in-memory data structures. Lucene segments are stored in individual files. Because segments are immutable, these files never change. This makes them very cache friendly, and the underlying OS will happily keep hot segments resident in memory for faster access.
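A quick, rough way to see that split on a running node is to compare the JVM's own view of the heap with what the OS is keeping in its page cache:

    # heap actually in use, per node, as reported by the JVM
    curl -s 'localhost:9200/_nodes/stats/jvm?pretty' | grep heap_used_in_bytes

    # the "cached" figure from free(1) is the filesystem cache holding Lucene's segment files
    free -g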

10 Gb JVM Heap Memory full but only 1Gb Field data cache

We have an ES 1.6 cluster with 4 nodes used to store mostly logging data (~500 documents a second).
ES is configured with 10 GB of heap, but after numerous OutOfMemoryExceptions and stop-the-world GCs we limited the field data cache to 10%.
My question is: why are all the nodes' JVMs constantly using ~9 GB of heap when field data (which I understand to be one of the primary consumers of heap) is limited to 1 GB?
Some graphs:
It's worth pointing out that our filter cache size is much smaller (~200 MB), and yes, the aggressively limited field data cache size does cause a lot of field data cache evictions.
What else is using so much heap?
Thanks
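One hedged way to break this down is the node stats API: fielddata, the filter cache and Lucene segment memory are all reported separately there, and segment memory is often the other big heap consumer (metric names as in ES 1.x):

    # per-node breakdown of the main heap consumers
    curl -s 'localhost:9200/_nodes/stats/indices?pretty' \
      | grep -E 'fielddata|filter_cache|segments|memory_in_bytes'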

How to improve percolator performance in ElasticSearch?

Summary
We need to increase percolator performance (throughput).
Most likely approach is scaling out to multiple servers.
Questions
How to do scaling out right?
1) Would increasing the number of shards in the underlying index allow running more percolate requests in parallel?
2) How much memory does an Elasticsearch server need if it does percolation only?
Is it better to have 2 servers with 4 GB RAM or one server with 16 GB RAM?
3) Would having an SSD meaningfully help percolator performance, or is it better to increase RAM and/or the number of nodes?
Our current situation
We have 200,000 queries (job search alerts) in our jobs index.
We are able to run 4 parallel queues that call the percolator.
Each queue is able to percolate a batch of 50 jobs in about 35 seconds, so we can percolate about:
4 queues * 50 jobs per batch / 35 seconds * 60 seconds per minute = ~343 jobs per minute
We need more.
Our jobs index has 4 shards, and we are using .percolator on top of that jobs index.
Hardware: a 2-processor server with 32 cores total and 32 GB RAM.
We allocated 8 GB of RAM to Elasticsearch.
When the percolator is working, the 4 percolation queues mentioned above consume about 50% of the CPU.
When we tried to increase the number of parallel percolation queues from 4 to 6, CPU utilization jumped to 75%+.
Worse, the percolator started to fail with NoShardAvailableActionException:
[2015-03-04 09:46:22,221][DEBUG][action.percolate] [Cletus Kasady] [jobs][3] Shard multi percolate failure
org.elasticsearch.action.NoShardAvailableActionException: [jobs][3] null
That error seems to suggest that we should increase the number of shards and eventually add a dedicated Elasticsearch server (and later increase the number of nodes).
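For reference, each batch of 50 jobs corresponds to a single multi-percolate request against the jobs index; a minimal sketch of one such request (ES 1.x API, with the type name job and the document fields being assumptions):

    # batch.json (NDJSON): one {"percolate": ...} header line plus one {"doc": ...}
    # line per job, 50 pairs per batch -- only two pairs shown here
    {"percolate": {"index": "jobs", "type": "job"}}
    {"doc": {"title": "Java developer", "location": "Berlin"}}
    {"percolate": {"index": "jobs", "type": "job"}}
    {"doc": {"title": "Data engineer", "location": "Remote"}}

    # send the whole batch as one request
    curl -s -XPOST 'localhost:9200/_mpercolate' --data-binary @batch.json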
Related:
How to Optimize elasticsearch percolator index Memory Performance
Answers
How to do scaling out right?
Q: 1) Would increasing number of shards in underlying index allow running more percolate requests in parallel?
A: No. Sharding is only really useful when creating a cluster. Additional shards on a single instance may in fact worsen performance. In general the number of shards should equal the number of nodes for optimal performance.
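As a hedged illustration of that guideline, an index sized for a two-node cluster would be created with two primary shards; the name jobs_v2 and the replica setting below are assumptions, not values from the question:

    # one primary shard per node, plus one replica of each for redundancy
    curl -XPUT 'localhost:9200/jobs_v2' -d '{
      "settings": { "number_of_shards": 2, "number_of_replicas": 1 }
    }'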
Q: 2) How much memory does ElasticSearch server need if it does percolation only?
Is it better to have 2 servers with 4GB RAM or one server with 16GB RAM?
A: Percolator indices reside entirely in memory, so the answer is: a lot. It depends entirely on the size of your index. In my experience, 200,000 queries would require a 50 MB index; in memory, that index would occupy around 500 MB of heap. Therefore 4 GB of RAM should be enough if this is all you're running. I would suggest more nodes in your case. However, as the size of your index grows, you will need to add RAM.
Q: 3) Would having SSD meaningfully help percolator's performance, or it is better to increase RAM and/or number of nodes?
A: I doubt it. As I said before, percolators reside in memory, so disk performance isn't much of a bottleneck.
EDIT: Don't take my word on those memory estimates. Check out the site plugins on the main ES site; I found BigDesk particularly helpful for watching performance counters for scaling and planning purposes. It should give you more valuable info for estimating your specific requirements.
EDIT in response to the comment from @DennisGorelik below:
I got those numbers purely from observation, but on reflection they make sense.
200K queries to 50 MB on disk: this ratio means the average query occupies 250 bytes when serialized to disk.
50 MB index to 500 MB on heap: rather than serialized objects on disk, we are dealing with in-memory Java objects. Think about deserializing XML (or any data format, really): you generally end up with in-memory objects that are roughly 10x larger.
