Heap memory overflow in master nodes (continuous GC) - Elasticsearch

Recently, I encountered an increase in heap memory usage on the master nodes (heap memory overflow on the master nodes, with continuous garbage collection). I tried to debug the root cause using the heap dumps saved in storage (sample file name for reference: java_pid1.hprof), but those files are encrypted and I am unable to find anything.
Is this the correct way to debug the heap memory issue?
If yes, how do I get a decrypted heap dump to get proper info?
If not, how do I debug the heap memory issue in the master nodes?
Elasticsearch info:
Running in Kubernetes
3 dedicated master nodes
3 data nodes (which are also the ingest nodes)
3 data nodes - each node spec: RAM 64GB, memory limit 32GB, heap size 28GB, disk size 1TB
3 master nodes - each node spec: RAM 16GB, memory limit 4GB, heap size 4GB, disk size 10GB

Hprof files can be opened in Eclipse. Eclipse has a special plugin to open hprof files, called the Memory Analyzer Tool (MAT).
I have done these exercises in the past, but usually you don't find much there.
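If it helps, here is a rough sketch of how I usually take a fresh dump and feed it to MAT from the command line; the PID, file paths, and the location of MAT's ParseHeapDump.sh are placeholders for your environment:

# take a heap dump of the running Elasticsearch master process (<es-pid> is a placeholder)
jmap -dump:live,format=b,file=/tmp/es-master-heap.hprof <es-pid>

# parse the dump headlessly with MAT and generate the "leak suspects" report
/path/to/mat/ParseHeapDump.sh /tmp/es-master-heap.hprof org.eclipse.mat.api:suspects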
Thanks.

Related

Max timeout reached - Indexing a document in Elasticsearch

Recently, when I try to index a document, the request responds with "max timeout reached"; after a certain point in time, it starts indexing again.
Now I'm trying to find the root cause of that issue. The only thing I'm able to find is that one of my master nodes was down at that time. Would that result in the timeout issue?
The infra details of my Elasticsearch cluster are:
Running in Kubernetes
3 data nodes - each node spec: RAM 64GB, memory limit 32GB, heap size 28GB, disk size 1TB
3 master nodes - each node spec: RAM 16GB, memory limit 4GB, heap size 4GB, disk size 10GB
Found the cause of it: all the masters were down at that time because of multiple issues:
heap dump saving filled the storage (out of space)
because the storage is shared, the nodes tried to dump heap data under the same file name (which throws a "file exists" error)
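One way to avoid the identical-file-name collision is to give each node its own dump location. A minimal sketch, assuming the heap dump flags live in each node's jvm.options; the directory path is a placeholder and should point at per-node (not shared) storage with enough free space:

-XX:+HeapDumpOnOutOfMemoryError
# write dumps to this node's own data volume instead of the shared path
-XX:HeapDumpPath=/usr/share/elasticsearch/data/heapdumps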

How can I upload and parse big files from Logstash to Elasticsearch

I have a 3-node cluster with 1 master and 2 data nodes, each set up with 1TB.
I have increased both -Xms24g and -Xmx24g to half my RAM (48GB total).
I then successfully uploaded a 140MB file from Kibana to ELK through the GUI, after increasing the limit from 100MB to 1GB.
When I tried to upload the same file with only Logstash, the process got stuck and broke Elasticsearch.
My pipeline is fairly simple:
input {
  file {
    # tail any file matching this pattern
    path => "/tmp/*_log"
  }
}
output {
  # index events into the local Elasticsearch node
  elasticsearch { hosts => ["localhost:9200"] }
  # also print each event to stdout for debugging
  stdout { codec => rubydebug }
}
Small files work great; I'm not able to push big files.
The log contains 1 million rows.
I set all fields in /etc/security/limits.conf to unlimited.
Any ideas what I'm missing?
You will need to increase the memory sizing in /etc/logstash/jvm.options.
The recommended heap size for typical ingestion scenarios should be no less than 4GB and no more than 8GB.
CPU utilization can increase unnecessarily if the heap size is too low, resulting in the JVM constantly garbage collecting. You can check for this issue by doubling the heap size to see if performance improves.
Do not increase the heap size past the amount of physical memory. Some memory must be left to run the OS and other processes. As a general guideline for most installations, don’t exceed 50-75% of physical memory. The more memory you have, the higher percentage you can use.
Set the minimum (Xms) and maximum (Xmx) heap allocation size to the same value to prevent the heap from resizing at runtime, which is a very costly process.
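For example, in /etc/logstash/jvm.options that would look roughly like this; 4g is just a starting point within the recommended 4-8GB range, not a measured value for your workload:

# give Logstash a fixed heap: same minimum and maximum so it never resizes at runtime
-Xms4g
-Xmx4g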
You can make more accurate measurements of the JVM heap by using either the jmap command line utility distributed with Java or by using VisualVM
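For instance, something along these lines should give a quick picture of heap and GC behavior while the pipeline runs; <logstash-pid> is a placeholder for the Logstash Java process ID:

# histogram of live objects on the Logstash heap
jmap -histo:live <logstash-pid>

# GC utilization sampled every second - constant full GCs suggest the heap is too small
jstat -gcutil <logstash-pid> 1000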

heap size when running elasticsearch cluster on kubernetes

I am running an Elasticsearch cluster on a Kubernetes cluster and I need to increase the heap size. Right now the heap size is 4GB and the memory allocated to the pod is 8GB. When setting up an Elasticsearch cluster on VMs/bare metal I have always followed the principle that the heap size should not be more than 50% of physical RAM; see https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
Now my question is: does this principle work the same way when running Elasticsearch on K8s, or how should I decide the heap size when running ES on K8s?
The heap size should be half the size of the RAM allocated to the pod.
From the Elasticsearch guide:
The heap size should be half the size of RAM allocated to the Pod. To minimize disruption caused by Pod evictions due to resource contention, you should run Elasticsearch pods at the "Guaranteed" QoS level by setting both requests and limits to the same value.
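As a rough sketch of what that looks like in a pod spec; the image tag, heap size, memory, and CPU values here are placeholders, not a recommendation for your cluster:

containers:
- name: elasticsearch
  image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
  env:
  - name: ES_JAVA_OPTS
    value: "-Xms4g -Xmx4g"   # heap set to half of the 8Gi memory below
  resources:
    requests:
      memory: "8Gi"
      cpu: "2"
    limits:
      memory: "8Gi"          # requests == limits -> Guaranteed QoS
      cpu: "2"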

HDFS Data Write Process for different disk size nodes

We have a 10-node HDFS cluster (Hadoop 2.6, Cloudera 5.8); 4 nodes have a disk size of 10TB and 6 nodes have a disk size of 3TB. The disks on the small-disk nodes are constantly getting full, while disk space is still free on the large-disk nodes.
I am trying to understand how the NameNode writes data/blocks to nodes with different disk sizes: is the data divided equally, or does each node get some percentage of the data?
You should look at dfs.datanode.fsdataset.volume.choosing.policy. By default this is set to round-robin, but since you have an asymmetric disk setup you should change it to the available-space policy.
You can also fine-tune disk usage with the other two choosing properties.
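In hdfs-site.xml that would look roughly like this; the threshold and preference values below are the Hadoop defaults, shown only to indicate which knobs exist:

<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<!-- volumes whose free space differs by less than this many bytes (10 GB) are treated as balanced -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>
<!-- fraction of new block allocations sent to the volumes with more available space -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>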
For more information see:
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/admin_dn_storage_balancing.html

What's the actual ideal NameNode memory size when there are a lot of files in HDFS

I will have 200 million files in my HDFS cluster. We know each file occupies about 150 bytes in NameNode memory, plus 3 blocks at roughly 150 bytes each, so about 600 bytes per file in the NN (200 million files × 600 bytes ≈ 120GB of metadata).
So I set my NN memory to 250GB to comfortably handle 200 million files. My question is: with such a big memory size of 250GB, will it cause too much pressure on GC? Is it feasible to give the NN 250GB of memory?
Can someone just say something? Why does nobody answer?
The ideal NameNode memory size is roughly the total space used by the metadata + the OS + the size of the daemons, plus 20-30% headroom for processing-related data.
You should also consider the rate at which data comes into your cluster. If data is coming in at 1TB/day, then you must plan for a bigger memory allocation or you will soon run out of memory.
It is always advised to have at least 20% of memory free at any point in time. This helps avoid the NameNode going into a full garbage collection.
As Marco specified earlier, you may refer to NameNode Garbage Collection Configuration: Best Practices and Rationale for the GC configuration.
In your case, 256GB looks good if you aren't going to receive a lot of new data and aren't going to do lots of operations on the existing data.
Refer: How to Plan Capacity for Hadoop Cluster?
Also refer: Select the Right Hardware for Your New Hadoop Cluster
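For reference, a sketch of what the NameNode heap and GC settings look like in hadoop-env.sh; the 250g heap mirrors the plan above, and the CMS and GC-logging flags are only examples of the kind of tuning the GC best-practices article describes:

# hadoop-env.sh: fixed NameNode heap plus GC logging so full collections are easy to spot
export HADOOP_NAMENODE_OPTS="-Xms250g -Xmx250g \
  -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  ${HADOOP_NAMENODE_OPTS}"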
You can have 256GB of physical memory in your NameNode. If your data increases in huge volumes, consider HDFS federation. I assume you already have multiple cores (with or without hyperthreading) in the NameNode host. The link below should address your GC concerns:
https://community.hortonworks.com/articles/14170/namenode-garbage-collection-configuration-best-pra.html
