how to limit memory usage of elasticsearch in ubuntu 17.10? - elasticsearch

My Elasticsearch service is consuming around 1 GB.
My total memory is 2 GB. The Elasticsearch service keeps getting shut down. I guess the reason is the high memory consumption. How can I limit the usage to just 512 MB?
This is the memory usage before starting Elasticsearch.
After running sudo service elasticsearch start, the memory consumption jumps.
I appreciate any help! Thanks!

From the official docs:
The default installation of Elasticsearch is configured with a 1 GB heap. For just about every deployment, this number is usually too small. If you are using the default heap values, your cluster is probably configured incorrectly.
So you can change it like this:
There are two ways to change the heap size in Elasticsearch. The easiest is to set an environment variable called ES_HEAP_SIZE. When the server process starts, it will read this environment variable and set the heap accordingly. As an example, you can set it via the command line as follows: export ES_HEAP_SIZE=512m
But it's not recommended. You just can't run Elasticsearch optimally with so little RAM available.
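Note that ES_HEAP_SIZE only works on older releases; from Elasticsearch 5.x onwards the heap is configured in jvm.options (or via ES_JAVA_OPTS). A minimal sketch for a 512 MB heap, assuming the default paths of the Ubuntu/Debian package:

# /etc/elasticsearch/jvm.options (default location for the deb/rpm packages)
# set the initial and maximum heap to 512 MB; keep both values equal
-Xms512m
-Xmx512m

# then restart the service so the new heap size takes effect
sudo service elasticsearch restart

Keep in mind that a 512 MB heap is below what Elastic recommends, so expect reduced indexing and query performance.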

Related

memory management for elasticsearch

I am trying to calculate a good balance of total memory in a three-node ES cluster.
If I have a three-node ES cluster, each node with 32 GB of memory and 8 vCPUs, which combination would be most suitable for balancing memory between all the components? I know there is no fixed answer, but I am trying to get as close as I can.
The Elastic components that will be used are Beats (Filebeat, Metricbeat, Heartbeat), Logstash, Elasticsearch, and Kibana.
The main use case for this cluster will be indexing application logs and querying them through curl calls, e.g. fetching the average response time over the last 7 or 30 days, or counting the different status codes over the last 24 hours or 7 days, so aggregations will be used. The other use case is monitoring and viewing logs through Kibana, but no ML jobs, dashboard creation, etc.
After going through the official docs below, the recommended heap sizes are as follows:
logstash -
https://www.elastic.co/guide/en/logstash/current/jvm-settings.html#heap-size
The recommended heap size for typical ingestion scenarios should be no less than 4GB and no more than 8GB.
elasticsearch -
https://www.elastic.co/guide/en/elasticsearch/reference/current/advanced-configuration.html#set-jvm-heap-size
Set Xms and Xmx to no more than 50% of your total memory. Elasticsearch requires memory for purposes other than the JVM heap
Kibana -
I haven't found a default or recommended memory size for Kibana, but in our single-node test cluster with 8 GB of memory it is using about 1.4 GB in total (256 MB / 1.4 GB).
Beats -
I haven't found a default or recommended memory size for Beats either, but they will also consume some memory.
Which of the combinations below would be ideal?
Combination 1: 32 GB = 16 GB for the OS + 16 GB for the Elasticsearch heap.
Out of the 16 GB on the OS side: 4 GB for Logstash, say 4 GB for the three Beats, and 2 GB for Kibana.
That leaves the OS with 6 GB, and if any new component has to be installed in the future (say APM, or anything else OS-related), it will have to share those 6 GB with the OS.
This follows the official recommendation for all components (i.e. 50% for the OS and 50% for Elasticsearch).
Combination 2: 32 GB = 8 GB for the Elasticsearch heap (25% for Elasticsearch).
4 GB for Logstash + 4 GB for Beats + 2 GB for Kibana.
That leaves 14 GB for the OS and any future component.
Am I missing anything that could change this memory split?
Any suggestion for adjusting the above combinations, or a completely different combination, is appreciated. For reference, a rough sketch of how the heap sizes in combination 2 would be configured is included below.
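Roughly how the heap sizes in combination 2 would be set, assuming the default deb-package paths (the values are just the ones discussed above):

# /etc/elasticsearch/jvm.options -- 8 GB Elasticsearch heap
-Xms8g
-Xmx8g

# /etc/logstash/jvm.options -- 4 GB Logstash heap
-Xms4g
-Xmx4g

Beats and Kibana are not JVM processes, so their memory is not set this way; their figures above are just an allowance on the OS side.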
Thanks,

AWS ElasticSearch Java Process Limit

AWS documentation makes clear the following:
Java Process Limit
Amazon ES limits Java processes to a heap size of 32 GB. Advanced users can specify the percentage of the heap used for field data. For more information, see Configuring Advanced Options and JVM OutOfMemoryError.
Elasticsearch instance types span right up to 500 GB of memory, so my question (as a Java/JVM amateur) is: how many Java processes does Elasticsearch run? I assume a 500 GB Elasticsearch instance (r4.16xlarge.elasticsearch) is somehow going to make use of more than 32 GB plus any host-system overhead?
Elasticsearch uses one Java process per node.
Indeed, as quoted, it is advised not to go over roughly 32 GB of heap for performance reasons (beyond that the JVM has to use 64-bit object pointers instead of compressed 32-bit ones, which decreases performance).
Another recommendation is to leave memory for the filesystem cache, which Lucene uses heavily to load doc values and other data from disk into memory.
Depending on your workload, it can be better to run multiple VMs on a single 500 GB server. You would be better off with 64-128 GB VMs, each split between ~31 GB for the Elasticsearch heap and the rest for the filesystem cache.
Multiple VMs on a server means that each VM is a separate Elasticsearch node.
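A quick way to see how a given node size translates into heap is the _cat/nodes API, which reports the maximum heap alongside total RAM per node (the endpoint below is a placeholder):

# maximum heap vs. total RAM for every node in the cluster
curl -s 'https://your-elasticsearch-endpoint/_cat/nodes?v&h=name,node.role,heap.max,ram.max'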

How to change elastic search heap size in elastic cloud?

I am using a 14-day trial account of Elastic Cloud. The account shows that I have a 4.6 GB heap size. I want to reduce my heap size to 2 GB, so how can I do that? I have checked the following ways of changing the heap size:
export ES_HEAP_SIZE=2g or
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch
But how can I reduce the heap size using one of the above options in Elastic Cloud?
Since Elastic Cloud is a managed service, end users do not have access to the backend master and data nodes in the cluster. Unfortunately, you cannot change the heap size setting of an Elastic Cloud Elasticsearch cluster. You can, however, scale down your cluster so that the allocated heap memory is also reduced. Alternatively, you could try emailing Elastic support at support@elastic.co and ask if they can change the heap size for you, but I highly doubt that level of customization is offered for the Elastic Cloud service.

Changing AWS Elasticsearch properties (without elasticsearch.yml) like threadpool queue size

I would like to change the thread_pool.write.queue_size setting of my AWS Elasticsearch domain. I see that the recommended technique is to update the elasticsearch.yml file, as it can no longer be changed dynamically through the API in newer versions.
However, since I am using AWS's Elasticsearch service, as far as I'm aware I don't have access to that file. Is there any way to make this change? I don't see it referenced for version 6.3 here, so I don't know how to do it with AWS.
You do not have a lot of flexibility with AWS ES. In your case, scale your data nodes to a bigger instance type, and that should give you a higher thread pool queue size. A note on increasing the number of shards: do not do it unless really required, as it may cause performance issues while searching, aggregating, etc. A shard can easily hold up to 50 GB of data, so if you have a lot of shards with very little data in them, think about shrinking them instead. Each shard itself consumes resources (CPU, memory), and the shard configuration should be proportional to the heap memory available on the node.
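Even though the setting itself cannot be changed on AWS ES, you can monitor the write thread pool to confirm whether scaling up actually helps, e.g. via the _cat/thread_pool API (the endpoint below is a placeholder):

# per-node queue depth, configured queue size and rejection count for the write pool
curl -s 'https://your-domain-endpoint/_cat/thread_pool/write?v&h=node_name,name,queue,queue_size,rejected'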

Memory Management in H2O

I am curious to know how memory is managed in H2O.
Is it completely in-memory, or does it allow swapping in case memory consumption goes beyond the available physical memory? Can I set the -mapperXmx parameter to 350 GB if I have a total of 384 GB of RAM on a node? I do realise that the node won't be able to handle anything other than the H2O cluster in this case.
Any pointers are much appreciated. Thanks.
H2O-3 stores data completely in memory, in a distributed, column-compressed key-value store.
No swapping to disk is supported.
Since you are alluding to mapperXmx, I assume you are talking about running H2O in a YARN environment. In that case, the total YARN container size allocated per node is:
mapreduce.map.memory.mb = mapperXmx * (1 + extramempercent/100)
extramempercent is another (rarely used) command-line parameter to h2odriver.jar. Note the default extramempercent is 10 (percent).
mapperXmx is the size of the Java heap, and the extra memory referred to above is for additional overhead of the JVM implementation itself (e.g. the C/C++ heap).
YARN is extremely picky about this, and if your container tries to use even one byte over its allocation (mapreduce.map.memory.mb), YARN will immediately terminate the container. (And for H2O-3, since it's an in-memory processing engine, the loss of one container terminates the entire job.)
You can set mapperXmx and extramempercent to as large a value as YARN has room to start containers for.
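As a rough illustration of the formula above (the flags are the standard h2odriver ones; the exact numbers are only an example): with the default extramempercent of 10, a 350 GB mapperXmx asks YARN for a 385 GB container, which does not fit on a 384 GB node, so you would need to back off a bit.

# mapreduce.map.memory.mb = mapperXmx * (1 + extramempercent/100)
#   350 GB * 1.10 = 385 GB  -> exceeds 384 GB of physical RAM
#   320 GB * 1.10 = 352 GB  -> leaves headroom for the OS and YARN daemons
# (YARN's maximum container size must also be configured to allow this)
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 320g -extramempercent 10 -output hdfsOutputDir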
