When we run Elasticsearch on the server, we hit a broken-pipe error:
"org.apache.catalina.connector.ClientAbortException: java.io.IOException: Broken pipe"
We fixed it by increasing Elasticsearch's heap memory, as follows.
First, check the current heap size of Elasticsearch:
ps aux | grep elasticsearch
"-Xms1g -Xmx1g"
Then increase the heap size:
vi /etc/sysconfig/elasticsearch
# Heap size defaults to 256m min, 1g max
# Set ES_HEAP_SIZE to 50% of available RAM, but no more than 31g
ES_HEAP_SIZE=3g
Check the new heap size:
ps aux | grep elasticsearch
"-Xms3g -Xmx3g"
What does Mem Avail mean in the YARN UI?
I set yarn.scheduler.minimum-allocation-mb to 1024 and yarn.scheduler.maximum-allocation-mb to 4096. yarn.nodemanager.resource.memory-mb is left at its default of -1. I can see that memory is free on every node, and the UI shows Phys Mem Used at just 14%. However, Mem Avail is 0 B, and I don't know what it is or how to increase it.
I found the answer!
It's equal to yarn.nodemanager.resource.memory-mb, which is the total amount of memory that YARN can use on a given node. You might need to set it higher in yarn-site.xml depending on the amount of data you plan on processing.
The default value of this config is 8 GB, although getconf reports -1, which does not mean the total memory of the system.
Before:
$ hdfs getconf -confKey yarn.nodemanager.resource.memory-mb
-1
After setting it in yarn-site.xml:
$ hdfs getconf -confKey yarn.nodemanager.resource.memory-mb
40960
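For reference, the yarn-site.xml entry that produces this value might look like the following (40960 MB is the example value from the getconf output above; pick a figure that fits your nodes):

```xml
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>40960</value>
</property>
```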
The result: Mem Avail in the UI now shows the configured value.
When I start Elasticsearch, I am getting this warning:
[2018-08-05T15:04:27,370][WARN ][o.e.b.BootstrapChecks ] [bDyfvVI] max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
I have set the needed value to 65536 by following this tutorial: https://www.elastic.co/guide/en/elasticsearch/reference/current/file-descriptors.html. I have also tried these steps:
Check ulimit -n; it shows 4096.
Edit /etc/security/limits.conf and add the following lines:
* soft nofile 65536
* hard nofile 65536
root soft nofile 65536
root hard nofile 65536
Edit /etc/pam.d/common-session and add this line: session required pam_limits.so
Edit /etc/pam.d/common-session-noninteractive and add this line: session required pam_limits.so
Reload the session and check ulimit -n again; it now shows 65536.
Unfortunately, I am still getting the warning. Can someone explain why?
We raised MAX_OPEN_FILES to 1024000 by changing the value in
/etc/default/elasticsearch
More information here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/file-descriptors.html
Elasticsearch uses a lot of file descriptors or file handles. Running out of file descriptors can be disastrous and will most probably lead to data loss. Make sure to increase the limit on the number of open files descriptors for the user running Elasticsearch to 65,536 or higher.
For the .zip and .tar.gz packages, set ulimit -n 65535 as root before starting Elasticsearch, or set nofile to 65535 in /etc/security/limits.conf.
On macOS, you must also pass the JVM option -XX:-MaxFDLimit to Elasticsearch in order for it to make use of the higher file descriptor limit.
RPM and Debian packages already default the maximum number of file descriptors to 65535 and do not require further configuration.
You can check the max_file_descriptors configured for each node using the Nodes Stats API, with:
GET _nodes/stats/process?filter_path=**.max_file_descriptors
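Over HTTP, the same check could be run with e.g. curl -s 'localhost:9200/_nodes/stats/process?filter_path=**.max_file_descriptors', and the limit pulled out of the JSON response. The node ID and value below are made-up examples:

```shell
# A sample filtered Nodes Stats response (node ID and value are hypothetical).
RESP='{"nodes":{"bDyfvVI":{"process":{"max_file_descriptors":65536}}}}'
# Extract the configured limit from the JSON
echo "$RESP" | grep -o '"max_file_descriptors":[0-9]*' | cut -d: -f2
```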
I have packaged the Eureka server provided by this repository https://github.com/spring-cloud-samples/eureka and tried to launch it on a cluster managed with Marathon/Mesos, on which memory constraints are set.
Nevertheless, if I start the app in Marathon with 512 MB, it takes 100 seconds to start (each slave has 32 GB of RAM) instead of 12 seconds on my Mac (16 GB of RAM).
Even configuring Xms and Xmx does not solve the issue. Using 256 MB is even worse.
We found the script below, but we're not sure it can be applied to the Tomcat instance launched by Spring Boot.
Inspired by
https://community.alfresco.com/docs/DOC-4914-jvm-tuning#w_generalcase
https://stackoverflow.com/a/33985214
# Total memory in KB (free reports kilobytes)
TOTAL_MEM_KB=`free | awk '/^Mem:/{print $2}'`
# Processor count
CPU_COUNT=`grep -c '^processor' /proc/cpuinfo`
# Take half of the memory for the Xmx setting, in GB
XMX_GB=`expr ${TOTAL_MEM_KB} / 1000 / 1000 / 2`
# MaxPermSize hardcoded to 256M
MAX_PERM_MB="256"
# G1HeapRegionSize is the lower integer round of Xmx/2048, in MB
G1_HEAP=`expr ${XMX_GB} \* 1000 / 2048`
# ParallelGCThreads is half the CPU count (rounded down)
PARA_GC=`expr ${CPU_COUNT} / 2`
# ConcGCThreads is half of ParallelGCThreads
CONC_GC=`expr ${PARA_GC} / 2`
JAVA_MEM="-Xmx${XMX_GB}g -XX:MaxPermSize=${MAX_PERM_MB}M -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:G1HeapRegionSize=${G1_HEAP}m -XX:ParallelGCThreads=${PARA_GC} -XX:ConcGCThreads=${CONC_GC}"
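As a sanity check of the arithmetic above, here is what the script computes for a hypothetical 32 GB, 24-core node (inputs hardcoded instead of being read from free and /proc/cpuinfo):

```shell
TOTAL_MEM_KB=32000000   # pretend `free` reported ~32 GB
CPU_COUNT=24            # pretend /proc/cpuinfo listed 24 processors

XMX_GB=$(( TOTAL_MEM_KB / 1000 / 1000 / 2 ))   # half of RAM, in GB -> 16
G1_HEAP=$(( XMX_GB * 1000 / 2048 ))            # heap region size, MB -> 7
PARA_GC=$(( CPU_COUNT / 2 ))                   # parallel GC threads -> 12
CONC_GC=$(( PARA_GC / 2 ))                     # concurrent GC threads -> 6

echo "JAVA_MEM=-Xmx${XMX_GB}g -XX:G1HeapRegionSize=${G1_HEAP}m -XX:ParallelGCThreads=${PARA_GC} -XX:ConcGCThreads=${CONC_GC}"
```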
I have a Hadoop cluster that we assume is performing pretty badly. The nodes are pretty beefy: 24 cores, 60+ GB RAM, etc. We are wondering whether some basic Linux/Hadoop default configuration is preventing Hadoop from fully utilizing our hardware.
There is a post here that describes a few possibilities that I think might apply.
I tried logging in to the namenode as root, as hdfs, and as myself, and looking at the output of lsof as well as the ulimit settings. Here is the output; can anyone help me understand why the settings don't match the number of open files?
For example, when I logged in as root, the lsof output looks like this:
[root@box ~]# lsof | awk '{print $3}' | sort | uniq -c | sort -nr
7256 cloudera-scm
3910 root
2173 oracle
1886 hbase
1575 hue
1180 hive
801 mapred
470 oozie
427 yarn
418 hdfs
244 oragrid
241 zookeeper
94 postfix
87 httpfs
...
But when I check out the ulimit output, it looks like this:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 806018
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I am assuming there should be no more than 1024 files opened by one user; however, the lsof output shows 7000+ files opened by one user. Can anyone explain what is going on here?
Correct me if I have made any mistake in understanding the relation between ulimit and lsof.
Many thanks!
You need to check the limits for the process; they may differ from those of your shell session. Also note that ulimit -n is a per-process limit, not a per-user total, and lsof lists duplicate entries for threads and memory-mapped files, so its per-user counts can legitimately exceed the limit.
For example:
[root@ADWEB_HAPROXY3 ~]# cat /proc/$(pidof haproxy)/limits | grep open
Max open files 65536 65536 files
[root@ADWEB_HAPROXY3 ~]# ulimit -n
4096
In my case, HAProxy has a directive in its config file to change the maximum number of open files; there should be something similar for Hadoop as well.
I had a very similar issue, which caused one of the cluster's YARN TimeLine servers to stop after reaching the magical 1024 open files limit, crashing with "too many open files" errors.
After some investigation, it turned out the server had serious trouble dealing with the sheer number of files in TimeLine's LevelDB store. For some reason YARN ignored the yarn.timeline-service.entity-group-fs-store.retain-seconds setting (by default 7 days, i.e. 604,800 seconds). We had LevelDB files dating back over a month.
What seriously helped was applying the fix described here: https://community.hortonworks.com/articles/48735/application-timeline-server-manage-the-size-of-the.html
Basically, there are a couple of options I tried:
Shrink the TTL (time to live) settings. First, enable TTL:
<property>
  <description>Enable age off of timeline store data.</description>
  <name>yarn.timeline-service.ttl-enable</name>
  <value>true</value>
</property>
Then set yarn.timeline-service.ttl-ms (set it to some low value for a period of time):
<property>
  <description>Time to live for timeline store data in milliseconds.</description>
  <name>yarn.timeline-service.ttl-ms</name>
  <value>604800000</value>
</property>
The second option, as described, is to stop the TimeLine server, delete the whole LevelDB database, and restart the server. This starts the ATS database from scratch, and works if the other options have failed.
To do it, find the database location from yarn.timeline-service.leveldb-timeline-store.path, back it up, and remove all subfolders from it. This operation requires root access to the server where TimeLine runs.
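The backup-and-reset step can be sketched in shell. The temp directory below is a stand-in for the real LevelDB location, which you should read from yarn.timeline-service.leveldb-timeline-store.path on your cluster; stop the TimeLine server before doing this:

```shell
# Stand-in for the real LevelDB path (hypothetical; use the value of
# yarn.timeline-service.leveldb-timeline-store.path on your cluster).
TIMELINE_DIR=$(mktemp -d)
mkdir -p "$TIMELINE_DIR/leveldb-timeline-store.ldb"   # fake existing data

BACKUP="${TIMELINE_DIR}.bak"
cp -a "$TIMELINE_DIR" "$BACKUP"   # back the database up first
rm -rf "$TIMELINE_DIR"/*          # remove all subfolders

ls -A "$TIMELINE_DIR" | wc -l     # directory is now empty
```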
Hope it helps.
HeapDumpOnOutOfMemoryError
I am seeing this when I do ps -aef | grep elasticsearch:
501 37347 1 0 2:29PM ttys004 0:04.14 /usr/bin/java -Xms4g
-Xmx4g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.path.home=/Users/abdullahmuhammad/elasticsearch -cp :/Users/abdullahmuhammad/elasticsearch/lib/elasticsearch-0.20.6.jar:/Users/abdullahmuhammad/elasticsearch/lib/:/Users/abdullahmuhammad/elasticsearch/lib/sigar/
org.elasticsearch.bootstrap.ElasticSearch
I have tried a few things: playing with the initial heap size, increasing and decreasing it. I have also deleted my whole index, but I still have no luck.
I used the following to delete the index:
curl -XDELETE 'http://localhost:9200/_all/'
Any help would be appreciated.
If you use plugins like Marvel, you should check the number of indices and their sizes, because some plugins create a large number of indices that can eat all your memory.
For the heap, Elasticsearch recommends 50% of your available memory.
In general, Elasticsearch's memory recommendations are: at most 64 GB, at least 8 GB.
Important documentation:
https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html
A few recommendations:
- Adjust your ES_HEAP_SIZE environment variable.
- Set the mlockall option of ES to true (in the config file). This locks the allocated heap in memory and prevents it from being swapped out.
- If your system is not very strong, decrease your shard count. Note that while increasing the number of shards improves insert performance, increasing the number of replicas improves query performance.
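The mlockall option mentioned above goes into elasticsearch.yml; on the 0.20.x version shown in the question the setting is named bootstrap.mlockall (newer releases renamed it bootstrap.memory_lock):

```yaml
# Lock the process address space into RAM, preventing the heap from swapping
bootstrap.mlockall: true
```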