Why does the result of MemoryPoolMXBean.MemoryUsage.getUsed for the old generation decrease without a mixed GC under G1?

From the GC log I can be sure that no mixed GC or full GC happened, yet the value of G1 Old Gen from MemoryUsage.getUsed decreased. From the following picture we can see that the old-generation usage repeatedly grows very slowly and then drops to a low value (as if the JVM had done a mixed GC).
So my question is: does a young GC affect old-generation usage?
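For reference, a minimal sketch of how this value can be sampled (the pool name "G1 Old Gen" assumes a JVM running G1):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class OldGenUsage {
    public static void main(String[] args) {
        // Find the G1 old-generation pool and print its current usage.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("G1 Old Gen".equals(pool.getName())) {
                System.out.println("used = " + pool.getUsage().getUsed() + " bytes");
            }
        }
    }
}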

Related

Recommended Java Heap Size for Commercial JMeter Project

Depending on the nature of the automated workflow and the number of active threads at any given time, the heap size requirement for JMeter can vary, and in the testing I am doing there is some ambiguity with respect to the effect of heap size on the test results. The initial and maximum heap sizes of the server hosting JMeter are shown in the attached screenshot.
Upon executing the test for a large number of concurrent users (e.g. 100), the built-in JMeter report does not render; however, the results can be seen in the CSV output. Will increasing the heap size solve this issue, and if so, by how much should we increase it? Note that this issue does not happen for a small user count such as 10 or 15.
What is the recommended industry-standard value for heap size and other system variables for a server used for commercial performance testing with JMeter?
There is no "recommended industry standard".
Each test is individual and you need to tune JMeter appropriately.
As of JMeter 5.5 the default heap size is 1 GB, which is sufficient for test development and debugging but might not be sufficient for the load you're trying to conduct.
According to this article:
"If the occupancy of the Java heap is too high, garbage collection occurs frequently. If the occupancy is low, garbage collection is infrequent but lasts longer... Try to keep the memory occupancy of the Java heap between 40% and 70% of the Java heap size... The highest point of occupancy of the Java heap is preferably not above 70% of the maximum heap size, and the average occupancy is between 40% and 70% occupancy. If the occupancy goes over 70%, resize the Java heap."
So I would recommend checking what's going on with your heap using JVisualVM or equivalent and adjusting it up or down as needed.
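If you'd rather log it from code, a minimal sketch (plain Java, nothing JMeter-specific) that prints heap occupancy as a percentage of the maximum, for comparison against the 40-70% guideline above:

// Print current heap occupancy relative to the configured maximum.
Runtime rt = Runtime.getRuntime();
long used = rt.totalMemory() - rt.freeMemory(); // bytes currently in use
long max = rt.maxMemory();                      // -Xmx ceiling in bytes
System.out.printf("Heap occupancy: %.1f%% of max%n", 100.0 * used / max);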
If your test runs fine and you're experiencing OOM issues only during dashboard generation, you can increase the heap temporarily by setting the relevant HEAP environment variable value.
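For example, a sketch (the HEAP variable is read by the jmeter startup scripts; the sizes and file names here are illustrative):

HEAP="-Xms1g -Xmx4g" ./jmeter -g result.csv -o dashboard

This regenerates the HTML dashboard from the existing CSV results with a larger heap, without re-running the test.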

Why can Go lower GC pauses to sub-1 ms while the JVM has not?

So there's that: https://groups.google.com/forum/?fromgroups#!topic/golang-dev/Ab1sFeoZg_8:
Today I submitted changes to the garbage collector that make typical worst-case stop-the-world times less than 100 microseconds. This should particularly improve pauses for applications with many active goroutines, which could previously inflate pause times significantly.
High GC pauses are one of the things JVM users have struggled with for a long time.
What are the (architectural?) constraints which prevent the JVM from lowering GC pauses to Go's levels but do not affect Go?
2021 update: with OpenJDK 16, ZGC now has a max pause time below 1 ms and average pause times around 50 µs.
It achieves these goals while still performing compaction, unlike Go's collector.
Update: with OpenJDK 17, Shenandoah adopts the same techniques introduced by ZGC and achieves similar results.
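For reference, on JDK 15 or newer both collectors can be selected with a single flag (a sketch; app.jar is a placeholder, and availability depends on the JDK build: Shenandoah, for instance, is not shipped in Oracle's builds):

java -XX:+UseZGC -jar app.jar
java -XX:+UseShenandoahGC -jar app.jar

On JDK 11-14, ZGC additionally requires -XX:+UnlockExperimentalVMOptions.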
What are the (architectural?) constraints which prevent JVM from lowering GC pauses to golang levels
There aren't any fundamental ones as low-pause GCs have existed for a while (see below). So this may be more a difference of impressions either from historic experience or out-of-the-box configuration rather than what is possible.
High GC pauses are one of the things JVM users have struggled with for a long time.
A little googling shows that similar solutions are available for Java too:
Azul offers a pauseless collector that scales even to 100 GB+ heaps.
Red Hat is contributing Shenandoah to OpenJDK, and Oracle is contributing ZGC.
IBM offers Metronome, also aiming for microsecond pause times.
There are various other real-time JVMs.
The other collectors in OpenJDK are, unlike Go's, compacting generational collectors. That is to avoid fragmentation problems and to provide higher throughput on server-class machines with large heaps, by enabling bump-pointer allocation and reducing the CPU time spent in GC. And, at least under good conditions, CMS can achieve single-digit-millisecond pauses, despite being paired with a moving young-generation collector.
Go's collector is non-generational, non-compacting and requires write barriers (see this other SO question), which results in lower throughput/more CPU overhead for collections, higher memory footprint (due to fragmentation and needing more headroom) and less cache-efficient placement of objects on the heap (non-compact memory layout).
So Go's GC is mostly optimized for pause time while staying relatively simple (by GC standards), at the expense of several other performance and scalability goals.
JVM GCs make different tradeoffs. The older ones often focused on throughput. The more recent ones achieve low pause times and several other goals at the expense of higher complexity.
According to the presentation Getting to Go: The Journey of Go's Garbage Collector, the Go collector only utilizes half of the heap for live data (heap = 2× live heap). This matches Go's default GOGC=100 setting, under which the next collection is triggered once the heap has grown by 100% of the live set left over from the previous collection.
My impression is that Java GCs generally aim for higher heap utilization, so they make a very different trade-off here.

How fast is the Go 1.5 GC with terabytes of RAM?

Java cannot use terabytes of RAM because the GC pause is way too long (minutes). With the recent update to the Go GC, I'm wondering if its GC pauses are short enough for use with huge amounts of RAM, such as a couple of terabytes.
Are there any benchmarks of this yet? Can we use a garbage-collected language with this much RAM now?
tl;dr:
You can't use TBs of RAM with a single Go process right now. The max is 512 GB on Linux, and the most I've seen tested is 240 GB.
With the current background GC, GC workload tends to be more important than GC pauses.
You can understand GC workload as pointers * allocation rate / spare RAM. Of apps using tons of RAM, only those with few pointers or little allocation will have a low GC workload.
I agree with inf's comment that huge heaps are worth asking other folks about (or testing). JimB notes that Go heaps have a hard limit of 512 GB right now, and 240 GB is the most I've seen tested.
Some things we know about huge heaps, from the design document and the GopherCon 2015 slides:
The 1.5 collector doesn't aim to cut GC work, just cut pauses by working in the background.
Your code is paused while the GC scans pointers on the stack and in globals.
The 1.5 GC has a short pause on a GC benchmark with a roughly 18 GB heap, as shown by the rightmost yellow dot along the bottom of a graph from the GopherCon talk.
Folks running a couple production apps that initially had about 300ms pauses reported drops to ~4ms and ~20ms. Another app reported their 95th percentile GC time went from 279ms to ~10ms.
Go 1.6 added polish and pushed some of the remaining work to the background. As a result, tests with heaps up to a bit over 200 GB still saw a max pause time of 20 ms, as shown in a slide from an early 2016 State of Go talk.
The same application that had 20ms pause times under 1.5 had 3-4ms pauses under 1.6, with about an 8GB heap and 150M allocations/minute.
Twitch, who use Go for their chat service, reported that by Go 1.7 pause times had been reduced to 1ms with lots of running goroutines.
1.8 took stack scanning out of the stop-the-world phase, bringing most pauses well under 1ms, even on large heaps. Early numbers look good. Occasionally applications still have code patterns that make a goroutine hard to pause, effectively lengthening the pause for all other threads, but generally it's fair to say the GC's background work is now usually much more important than GC pauses.
Some general observations on garbage collection, not specific to Go:
The frequency of collections depends on how quickly you use up the RAM you're willing to give to the process.
The amount of work each collection does depends in part on how many pointers are in use.
(That includes the pointers within slices, interface values, strings, etc.)
Rephrased, an application accessing lots of memory might still not have a GC problem if it only has a few pointers (e.g., it handles relatively few large []byte buffers), and collections happen less often if the allocation rate is low (e.g., because you applied sync.Pool to reuse memory wherever you were chewing through RAM most quickly).
So if you're looking at something involving heaps of hundreds of GB that's not naturally GC-friendly, I'd suggest you consider any of the following:
writing in C or such
moving the bulky data out of the object graph. For example, you could manage data in an embedded DB like bolt, put it in an outside DB service, or use something like groupcache or memcache if you want more of a cache than a DB
running a set of smaller-heap'd processes instead of one big one
just carefully prototyping, testing, and optimizing to avoid memory issues.
The new Java ZGC garbage collector can now handle up to 16 terabytes of memory and garbage collect in under 10 ms.

Hadoop DataNode memory consumption and GC behaviour

Recently we have been running into issues with our cluster (CDH 5.3.1), which manifested in both the NameNodes and the DataNodes being stuck in long GC cycles, varying from 30 seconds up to several minutes.
The JVM settings were still the default ones, but given that our cluster has in the meantime grown to 34 million blocks, the behaviour was explainable.
For the NN, a simple adjustment of the heap size and other minor adjustments to the GC settings (e.g. young-generation size, SurvivorRatio) have gotten us predictably short GC pauses again.
For the DN, however, we are still suffering from periodic long GC pauses. What I observe is that exceptionally long GC pauses (full GCs) occur every 6 hours. I assume that Cloudera's default of 6 h for the block-report interval, dfs.blockreport.intervalMsec, is contributing to this pattern.
What I'd like to understand is whether there are suggestions for how I can approach this problem, where I need to find GC settings that cater both for normal-operation memory allocation (which seems to be mostly fine) and for the rapid allocation I'm seeing every 6 hours for a few minutes.
The DN servers have 256 GB of RAM and 20 physical cores.
This is Java HotSpot, jdk1.7.0_67.
My current, suboptimal settings are:
-server
-Xmn5g
-Xms12884901888
-Xmx12884901888
-XX:SurvivorRatio=3
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSConcurrentMTEnabled
-XX:CMSInitiatingOccupancyFraction=60
-XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+ScavengeBeforeFullGC
-XX:+CMSScavengeBeforeRemark
-XX:MaxTenuringThreshold=15
I'd also be interested to hear whether, instead of tweaking the JVM, there is a way to make the block report less aggressive.
See the GC log for the time range in question:
http://hastebin.com/zafabohowi
OK, running the log through GCViewer, it simply looks like there's a burst of activity (e.g. starting at 17:09) that fills up the old generation until it causes some failures (at 17:15).
Simply try bumping the heap size to give it more breathing room until the task has finished.
Beyond the concurrent mode failure there still seem to be some relatively long pauses; try applying these options to see if they can shave off some milliseconds.
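As for the block-report question: assuming CDH passes the stock HDFS properties through (worth verifying for your version), the interval and the initial stagger are set in hdfs-site.xml. A sketch with illustrative values:

<!-- hdfs-site.xml: values are illustrative -->
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>43200000</value> <!-- 12 h instead of the default 6 h -->
</property>
<property>
  <name>dfs.blockreport.initialDelay</name>
  <value>600</value> <!-- spread each DataNode's first report over up to 10 min -->
</property>

Note that a longer interval also means the NameNode's block map goes longer without a full refresh.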

Java heap bottleneck - how to identify the cause?

I have a J2EE project running on JBoss, with a maximum heap size of 2048m, which is giving strange results under load testing. I've benchmarked heap and CPU usage and received the following results (series 1 is heap usage, series 2 is CPU usage):
It seems as if the heap is being used and garbage collected properly around A. When it gets to B, however, there appears to be some kind of bottleneck: there is heap space available, but usage never breaks that imaginary line. At the same time, at C, the CPU usage drops dramatically. During this period we also receive an "OutOfMemoryError (GC overhead limit exceeded)", which does not make much sense to me, as there is heap space available.
My guess is that there is some kind of bottleneck, but what exactly, I can't even imagine. How would you suggest going about finding the cause of the issue? I've profiled the memory usage and noticed that there are quite a few instances of one class (around a million), but the total size of these instances is fairly small (around 50 MB, if I remember correctly).
Edit: The server is dedicated to this application, and the CPU usage given is only for the JVM (there should not be any significant CPU usage outside of the JVM). The memory usage is only for the heap; it does not include the permgen space. This problem is reproducible. My main concern is the limit encountered around B, for which I have not found a plausible explanation yet.
Conclusion: It turns out this was caused by a bunch of long-running SQL queries being executed concurrently. The returned ResultSets were also very large, possibly explaining the OOME. I still have no reasonable explanation for why there appears to be some limit at B.
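Given that conclusion, one mitigation worth sketching (hypothetical code; the actual data layer may differ) is to stream large ResultSets instead of materializing them in memory:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class StreamingQuery {
    // Iterate over rows a batch at a time instead of buffering the whole result.
    static void stream(Connection conn, String sql) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(1000); // hint to fetch in batches; behaviour is driver-specific
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // handle one row here instead of collecting them all
                }
            }
        }
    }
}

Whether the fetch-size hint is honoured depends on the driver (MySQL's, for example, only streams with a fetch size of Integer.MIN_VALUE).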
From the error message it appears that the JVM is using the parallel scavenge algorithm for garbage collection. The message is dumped along with an OOME when a lot of time is spent on GC but very little of the heap is recovered.
The document from Sun does not specify whether the 98% of total time consumed is to be read as 98% of the CPU utilization of the process or of the CPU itself. In either case, I have to draw the following inferences (with limited information):
The garbage collector or the JVM process does not have enough CPU utilization, most likely due to other processes consuming CPU at the same time.
The garbage collector does not have enough CPU utilization since it is a low priority thread, and another memory intensive (but not CPU intensive) thread in the JVM is doing work at the same time, which results in the failure to de-allocate memory.
Based on the above inferences (all, one, or none of them could be true), it would be worthwhile to correlate the graph that you've obtained with the runtime behavior of the application as far as users are concerned. In other words, you might find it useful to determine whether other processes are kicked off when your problem occurs, or which part of the application is in operation when the problem occurs.
In any case, the page referenced above does give an option to disable the GC overhead limit used by the GC algorithm.
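For reference, that switch is the following (use with care; it typically just trades the early OOME for a later, harder one):

-XX:-UseGCOverheadLimit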
EDIT: If the problem occurs periodically and can be reproduced, it might turn out to be a memory leak; otherwise (i.e. if it occurs sporadically), you are better off tuning the GC algorithm or even changing it.
If I want to know where the "bottlenecks" are, I just take a few stackshots. There's no need to wonder, guess, and play detective; they will just tell you.
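On HotSpot that can be as simple as running jstack against the process a few times while it is slow (12345 is a placeholder pid):

jstack 12345 > stack-1.txt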
Usually memory problems and performance problems go hand in hand, so if you fix the performance problems, you will also fix the memory problems (not for certain, though).
