Out of Memory: Metaspace with Java 8

My product has 256 MB of RAM. I upgraded from Java 6 to Java 8 and then started facing memory-related issues:
with Java 8, memory consumption keeps increasing over time, while with the same code it is stable on Java 6.
I have read a lot about Metaspace and Java 8 and found the Metaspace parameters below.
I tried the following combinations and still got an out-of-memory error:
1. MaxMetaspaceSize=50M, MaxMetaspaceFreeRatio=60, MinMetaspaceFreeRatio=50
2. MaxMetaspaceSize=30M
3. MaxMetaspaceSize=40M
4. MaxMetaspaceSize=50M
5. MaxMetaspaceSize=80M
But after 2 to 3 hours an OutOfMemoryError: Metaspace is raised every time.
Can someone explain what the Metaspace options MaxMetaspaceSize, MaxMetaspaceFreeRatio and MinMetaspaceFreeRatio are?
How do I decide what the right size is?
What is the correct combination of these values to avoid out-of-memory errors in production and reduce memory consumption?
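For reference, a minimal sketch (the values below are illustrative assumptions, not a recommendation) of how these options are passed: the two FreeRatio options take a percentage rather than a size, and enabling Native Memory Tracking lets you watch the actual Metaspace footprint.
java -XX:MaxMetaspaceSize=256m \
     -XX:MinMetaspaceFreeRatio=40 \
     -XX:MaxMetaspaceFreeRatio=70 \
     -XX:NativeMemoryTracking=summary \
     -jar app.jar
While the application is running, jcmd <pid> VM.native_memory summary prints how much of that memory is actually committed for class metadata.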

The issue is resolved.
I found that there are known issues with JAXB:
I used StringBuffer instead of String.
I removed unnecessary instances of JAXBContext.
Refer to this link: Are there any memory utilization issues with JAXB?
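As an illustration of the JAXBContext part of the fix (the class names here are hypothetical, not taken from the original code), the idea is to create the context once and reuse it, because JAXBContext.newInstance is expensive and every extra instance keeps its own class metadata alive:
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlRootElement;

public final class JaxbContextHolder {

    // Hypothetical payload type, included only to make the example self-contained.
    @XmlRootElement
    public static class Order {
        public String id;
    }

    // JAXBContext is thread-safe, so a single instance per set of bound classes
    // is enough; Marshaller/Unmarshaller objects are cheap and created per use.
    private static final JAXBContext CONTEXT = createContext(Order.class);

    private static JAXBContext createContext(Class<?>... boundClasses) {
        try {
            return JAXBContext.newInstance(boundClasses);
        } catch (JAXBException e) {
            throw new IllegalStateException("Unable to create JAXBContext", e);
        }
    }

    public static Marshaller newMarshaller() throws JAXBException {
        return CONTEXT.createMarshaller();
    }

    private JaxbContextHolder() {
    }
}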

Related

Pod sizing with Actuator metrics jvm.memory.max

I am trying to size our pods using the Actuator metrics info, with the below Kubernetes resource quota configuration:
resources:
  requests:
    memory: "512Mi"
  limits:
    memory: "512Mi"
We are observing that jvm.memory.max returns ~1455 MB. I understand that this value includes heap and non-heap. Further drilling into the API (jvm.memory.max?tag=area:nonheap and jvm.memory.max?tag=area:heap) returns ~1325 MB and ~129 MB respectively.
Obviously, with the non-heap allowed to max out at a value greater than the Kubernetes limit, the container is bound to get killed eventually. But why is the JVM (non-heap memory) not bounded by the memory configuration of the container (configured in Kubernetes)?
The above observations are valid with Java 8 and Java 11. The blog below discusses the experimental options with Java 8, where CPU and heap configuration are covered, but non-heap is not mentioned. What are some suggestions to consider in sizing the pods?
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
Source
Java 8 has a few flags that can help the runtime operate in a more container aware manner:
java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -jar app.jar
Why do you get a maximum JVM heap of 129 MB if you set the maximum container memory limit to 512 MB? The answer is that memory consumption in the JVM includes both heap and non-heap memory. The memory required for class metadata, JIT-compiled code, thread stacks, GC, and other processes is taken from non-heap memory. Therefore, based on the cgroup resource restrictions, the JVM reserves a portion of the memory for non-heap use to ensure system stability.
The exact amount of non-heap memory can vary widely, but a safe bet if you're doing resource planning is that the heap is about 80% of the JVM's total memory. So if you set the maximum heap to 1000 MB, you can expect the whole JVM to need around 1250 MB.
The JVM read that the container is limited to 512 MB and created a JVM with a maximum heap size of ~129 MB, exactly 1/4 of the container memory, as described on the JDK ergonomics page.
If you dig into the JVM Tuning guide you will see the following.
Unless the initial and maximum heap sizes are specified on the command line, they're calculated based on the amount of memory on the machine. The default maximum heap size is one-fourth of the physical memory while the initial heap size is 1/64th of physical memory. The maximum amount of space allocated to the young generation is one third of the total heap size.
You can find more information about it here.
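As a sketch of one way to respect the container limit (the flag values are assumptions to adapt, not a verified recipe): on Java 8u191+ and Java 11 the heap can be sized relative to the cgroup limit instead of the 1/4 default, and the main non-heap consumers can be capped explicitly so heap plus non-heap stays under 512 MiB.
java -XX:MaxRAMPercentage=60.0 \
     -XX:MaxMetaspaceSize=128m \
     -XX:ReservedCodeCacheSize=64m \
     -Xss512k \
     -jar app.jar
With a 512 MiB limit this gives roughly a 300 MiB heap and leaves the rest for Metaspace, the code cache, thread stacks, and GC overhead.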

Application is slowing down as Metaspace grows

I just moved from Grails 2.4.3 to Grails 2.5.6 and from Java 7 to Java 8.
I'm trying to set the optimal Metaspace size for my app.
The current Metaspace size has a big impact on application performance.
Used Metaspace vs. average response time:
200 MB - 339 ms
300 MB - 380 ms
400 MB - 430 ms
500 MB - 460 ms
600 MB - 530 ms
Metaspace grows from application start to 620 MB within 90 minutes.
These are my current GC settings:
-Xms14G -Xmx14G\
-XX:+UseG1GC\
-XX:ParallelGCThreads=8\
-XX:ConcGCThreads=4\
-XX:MaxGCPauseMillis=200\
-XX:+UseLargePages\
-XX:+UseLargePagesInMetaspace\
-XX:+AlwaysPreTouch\
-XX:InitialBootClassLoaderMetaspaceSize=512M\
-XX:MetaspaceSize=512M\
-XX:MinMetaspaceExpansion=8M\
-XX:MaxMetaspaceExpansion=32M\
-XX:+UseStringDeduplication\
-XX:+ParallelRefProcEnabled\
-XX:-TieredCompilation\
When MaxMetaspaceSize was set to 512M, then after a few hours of running, my app slows down once or twice per hour; response time is around 10 seconds during those periods.
Has anyone had such a problem? Does Metaspace have this kind of impact on performance in your applications?
Have you profiled your app?
I'd say that Metaspace garbage collection is involved here. It collects dead classes and classloaders, and it is triggered once class metadata usage reaches the current high-water mark (which is capped by -XX:MaxMetaspaceSize).
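A sketch of how that could be confirmed before resizing anything (the PID and interval are placeholders): watch Metaspace capacity over time with jstat, check per-classloader statistics, and log class loading/unloading to see whether old classloaders are ever collected.
jstat -gcmetacapacity <pid> 10s   # Metaspace capacity columns, sampled every 10 seconds
jmap -clstats <pid>               # per-classloader statistics (JDK 8)
-verbose:class                    # add to the startup options to log class loading and unloading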

Need help understanding my ElasticSearch Cluster Health

When querying my cluster, I noticed these stats for one of the nodes in the cluster. I am new to Elastic and would like the community's help in understanding what they mean and whether I need to take any corrective measures.
Does the heap used look on the high side, and if so, how would I rectify it? Also, any comments on the system memory used would be helpful; it feels like it is on the really high side as well.
These are the JVM level stats
JVM
Version OpenJDK 64-Bit Server VM (1.8.0_171)
Process ID 13735
Heap Used % 64%
Heap Used/Max 22 GB / 34.2 GB
GC Collections (Old/Young) 1 / 46,372
Threads (Peak/Max) 163 / 147
This is the OS Level stats
Operating System
System Memory Used % 90%
System Memory Used 59.4 GB / 65.8 GB
Allocated Processors 16
Available Processors 16
OS Name Linux
OS Architecture amd64
As you state that you are new to Elasticsearch, I suggest you go through the cluster API as well as the cat API; you can find the documentation at cluster API and cat API.
This will help you understand things in more depth.
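A minimal sketch of the calls being referred to (host and port are assumptions for a local node):
curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.percent,ram.percent,cpu'
The second call lists heap and system memory usage per node, which maps directly to the numbers quoted above.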

ElasticSearch 30.5GB Heap Size Restriction

We use ES to store around 2.5TB of data. We have 12 primary shards and 2 replica shards.
We are currently load testing ES and I read the following article
https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
This article states two important things: first, allocate 50% of memory to Lucene, and second, don't cross the 30.5 GB limit for heap space.
I don't clearly understand the 30.5 GB limit. I understand that if I set 40 GB instead of 30.5 GB I will lose more than I gain (because of compressed pointers), but say I have hardware with around 250 GB of RAM, what are the reasons I should only allocate 30.5 GB and not 120 GB for heap? Won't I start seeing gains with a 70-80 GB heap over a 30.5 GB heap? Can somebody list all the reasons?
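One quick check related to the compressed-pointers part of the question (the -Xmx values are only examples): the JVM reports whether compressed ordinary object pointers are still enabled for a given heap size.
java -Xmx31g -XX:+PrintFlagsFinal -version | grep -i usecompressedoops
java -Xmx40g -XX:+PrintFlagsFinal -version | grep -i usecompressedoops
At 31 GB, UseCompressedOops is still true; above roughly 32 GB it flips to false and every object reference doubles in size, which is where the ~30.5 GB guidance comes from.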

Adobe Experience Manager (AEM), Java garbage collection tuning and memory management

I am currently using the Adobe Experience Manager for a Client's site (Java language). It uses openJDK:
#java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (rhel-2.5.1.2.el6_5-x86_64 u65-b17)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
It is running on Rackspace with the following:
vCPU: 4
Memory: 16GB
Guest OS: Red Hat Enterprise Linux 6 (64-bit)
Since it has been in production, I have been experiencing very slow performance in the application. It goes like this: I launch the app, everything is smooth, then 3 to 4 days later CPU usage spikes to 400% (~4000 users/day hit the site). I got a few OOM exceptions (1 or 2), but mostly the site was exceptionally slow without ever throwing an OOM exception. Since I am a novice at Java memory management, I started reading about how it works and found tools like jstat. When the system was overwhelmed the second time around, I ran:
#top
Got the PID of the java process, then pressed shift+H and noted the PIDs of the threads with a high CPU percentage. Then I ran
#sudo -uaem jstack <PID>
Got a thread dump, converted the thread PIDs I had written down to hex, and searched for their values in the dump. After all that, I finally found that it was, not surprisingly, the garbage collector that was flipping out for some reason.
I started reading a lot about Java GC tuning and came up with the following Java options, then restarted the application with them:
java
-Dcom.day.crx.persistence.tar.IndexMergeDelay=0
-Djackrabbit.maxQueuedEvents=1000000
-Djava.io.tmpdir=/srv/aem/tmp/
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/srv/aem/tmp/
-Xms8192m -Xmx8192m
-XX:PermSize=256m
-XX:MaxPermSize=1024m
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:ParallelGCThreads=4
-XX:NewRatio=1
-Djava.awt.headless=true
-server
-Dsling.run.modes=publish
-jar crx-quickstart/app/cq-quickstart-6.0.0-standalone.jar start
-c crx-quickstart -i launchpad -p 4503
-Dsling.properties=conf/sling.properties
And it looks like it is performing much better but I think that it probably needs more GC tuning.
When I run:
#sudo -uaem jstat -gcutil <PID>
I get this:
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 0.00 55.97 100.00 45.09 4725 521.233 505 4179.584 4700.817
four days after I restarted it.
When I run:
#sudo -uaem jstat -gccapacity <PID>
I get this:
NGCMN NGCMX NGC S0C S1C EC
4194304.0 4194304.0 4194304.0 272896.0 279040.0 3636224.0
OGCMN OGCMX OGC OC PGCMN PGCMX
4194304.0 4194304.0 4194304.0 4194304.0 262144.0 1048576.0
PGC PC YGC FGC
262144.0 262144.0 4725 509
four days after I restarted it.
These results are much better than when I started, but I think it can get even better. I'm not really sure what to do next, as I'm no GC pro, so I was wondering if you have any tips or advice on how I could get better app/GC performance, and whether anything obvious stands out, like the ratios and sizes of the young and old generations.
How should I set the survivor and eden sizes/ratios?
Should I change the GC type, for example use CMS GC or G1?
How should I proceed?
Any advice would be helpful.
Best,
Nicola
The young to old area ratio is typically 1:3, but it can vary depending on how the application uses short-lived and long-lived objects. If short-lived objects dominate, the young space can be enlarged, for example to 2:3 (young:old). The reason for increasing the ratio is to avoid scavenge GC cycles: when many short-lived objects are allocated, the young space fills quickly and triggers scavenge GC cycles, which in turn affects application performance. If the ratio is increased beyond the current value, the number of scavenge GC cycles can be reduced. When the young space is increased, the survivor and eden spaces grow accordingly.
The CMS policy is used to reduce application pause times, while the G1 policy is targeted at larger heaps with high throughput. The GC policy can be changed based on the needs of the application.
Recommended Use Cases for G1 :
The first focus of G1 is to provide a solution for users running applications that require large heaps with limited GC latency.
This means heap sizes of around 6GB or larger, and stable and predictable pause time below 0.5 seconds.
As you use an 8 GB heap, you can test the G1 GC policy in the same environment in order to check the GC performance.
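As a hedged sketch only (the log path and values are assumptions derived from the settings above, and these are Java 7 HotSpot flags matching the JDK in the question), switching the listed options to G1 while writing a GC log for comparison could look like this:
java -Xms8192m -Xmx8192m \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -XX:PermSize=256m -XX:MaxPermSize=1024m \
     -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
     -Xloggc:/srv/aem/tmp/gc.log \
     -jar crx-quickstart/app/cq-quickstart-6.0.0-standalone.jar start \
     -c crx-quickstart -i launchpad -p 4503
Dropping -XX:NewRatio=1 matters here, because G1 sizes the young generation itself to meet the pause-time goal, and a fixed ratio works against that.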
