I'm writing a library that uses the ffmpeg library interface (libavcodec, libavformat, libavutil, etc.) to manipulate some video. Part of this involves some encoding for which I'm using libx264. Everything works great but libx264 writes output to stderr, for example:
[libx264 @ 0x62cbc0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
[libx264 @ 0x62cbc0] profile High, level 5.0, 4:2:0, 8-bit
[libx264 @ 0x62cbc0] frame I:28 Avg QP:26.78 size:144822
[libx264 @ 0x62cbc0] frame P:135 Avg QP:32.21 size: 30853
[libx264 @ 0x62cbc0] frame B:108 Avg QP:36.18 size: 15709
[libx264 @ 0x62cbc0] consecutive B-frames: 20.3% 79.7%
[libx264 @ 0x62cbc0] mb I I16..4: 9.5% 75.9% 14.6%
[libx264 @ 0x62cbc0] mb P I16..4: 1.1% 1.5% 1.4% P16..4: 44.8% 9.6% 4.2% 0.0% 0.0% skip:37.5%
[libx264 @ 0x62cbc0] mb B I16..4: 0.3% 0.2% 0.4% B16..8: 49.8% 4.6% 0.9% direct: 1.0% skip:42.7% L0:46.9% L1:49.8% BI: 3.2%
[libx264 @ 0x62cbc0] final ratefactor: 26.71
[libx264 @ 0x62cbc0] 8x8 transform intra:68.4% inter:70.4%
[libx264 @ 0x62cbc0] direct mvs spatial:87.0% temporal:13.0%
[libx264 @ 0x62cbc0] coded y,uvDC,uvAC intra: 76.2% 73.8% 31.6% inter: 9.7% 9.5% 4.2%
[libx264 @ 0x62cbc0] i16 v,h,dc,p: 9% 57% 7% 27%
[libx264 @ 0x62cbc0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 9% 21% 4% 8% 9% 7% 12% 7% 22%
[libx264 @ 0x62cbc0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 13% 28% 9% 6% 6% 6% 10% 6% 16%
[libx264 @ 0x62cbc0] i8c dc,h,v,p: 20% 50% 19% 11%
[libx264 @ 0x62cbc0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x62cbc0] ref P L0: 73.2% 12.6% 10.0% 2.7% 1.1% 0.3%
[libx264 @ 0x62cbc0] ref B L0: 92.3% 5.6% 1.7% 0.4%
[libx264 @ 0x62cbc0] kb/s:7912.01
It is not acceptable for my library to write to stdout or stderr. How can I programmatically disable this output to stderr?
I realize that I can "hijack" stderr, but this seems like a crude hack and I would strongly prefer not to do so.
x264 has a --quiet command-line option. When x264 is used as a library, the equivalent is to set
param->i_log_level = X264_LOG_NONE;
I have written a benchmark to compute memory bandwidth:
#include <benchmark/benchmark.h>
double sum_array(double* v, long n)
{
double s = 0;
for (long i =0 ; i < n; ++i) {
s += v[i];
}
return s;
}
void BM_MemoryBandwidth(benchmark::State& state) {
long n = state.range(0);
double* v = (double*) malloc(state.range(0)*sizeof(double));
for (auto _ : state) {
benchmark::DoNotOptimize(sum_array(v, n));
}
free(v);
state.SetComplexityN(state.range(0));
state.SetBytesProcessed(int64_t(state.range(0))*int64_t(state.iterations())*sizeof(double));
}
BENCHMARK(BM_MemoryBandwidth)->RangeMultiplier(2)->Range(1<<5, 1<<23)->Complexity(benchmark::oN);
BENCHMARK_MAIN();
I compile with
g++-9 -masm=intel -fverbose-asm -S -g -O3 -ffast-math -march=native --std=c++17 -I/usr/local/include memory_bandwidth.cpp
This produces a bunch of moves from RAM, and then some addpd instructions which perf says are hot, so I go into the generated asm and remove them, then assemble and link via
$ g++-9 -c memory_bandwidth.s -o memory_bandwidth.o
$ g++-9 memory_bandwidth.o -o memory_bandwidth.x -L/usr/local/lib -lbenchmark -lbenchmark_main -pthread -fPIC
At this point, I get the perf output that I expect: movement of data into xmm registers, increments of the pointer, and a jmp at the end of the loop.
All fine and well up to here. Now here's where things get weird:
I inquire of my hardware what the memory bandwidth is:
$ sudo lshw -class memory
*-memory
description: System Memory
physical id: 3c
slot: System board or motherboard
size: 16GiB
*-bank:1
description: DIMM DDR4 Synchronous 2400 MHz (0.4 ns)
vendor: AMI
physical id: 1
slot: ChannelA-DIMM1
size: 8GiB
width: 64 bits
clock: 2400MHz (0.4ns)
So I should be getting at most 8 bytes * 2.4 GHz = 19.2 gigabytes/second.
But instead I get 48 gigabytes/second:
-------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
-------------------------------------------------------------------------------------
BM_MemoryBandwidth/32 6.43 ns 6.43 ns 108045392 bytes_per_second=37.0706G/s
BM_MemoryBandwidth/64 11.6 ns 11.6 ns 60101462 bytes_per_second=40.9842G/s
BM_MemoryBandwidth/128 21.4 ns 21.4 ns 32667394 bytes_per_second=44.5464G/s
BM_MemoryBandwidth/256 47.6 ns 47.6 ns 14712204 bytes_per_second=40.0884G/s
BM_MemoryBandwidth/512 86.9 ns 86.9 ns 8057225 bytes_per_second=43.9169G/s
BM_MemoryBandwidth/1024 165 ns 165 ns 4233063 bytes_per_second=46.1437G/s
BM_MemoryBandwidth/2048 322 ns 322 ns 2173012 bytes_per_second=47.356G/s
BM_MemoryBandwidth/4096 636 ns 636 ns 1099074 bytes_per_second=47.9781G/s
BM_MemoryBandwidth/8192 1264 ns 1264 ns 553898 bytes_per_second=48.3047G/s
BM_MemoryBandwidth/16384 2524 ns 2524 ns 277224 bytes_per_second=48.3688G/s
BM_MemoryBandwidth/32768 5035 ns 5035 ns 138843 bytes_per_second=48.4882G/s
BM_MemoryBandwidth/65536 10058 ns 10058 ns 69578 bytes_per_second=48.5455G/s
BM_MemoryBandwidth/131072 20103 ns 20102 ns 34832 bytes_per_second=48.5802G/s
BM_MemoryBandwidth/262144 40185 ns 40185 ns 17420 bytes_per_second=48.6035G/s
BM_MemoryBandwidth/524288 80351 ns 80347 ns 8708 bytes_per_second=48.6171G/s
BM_MemoryBandwidth/1048576 160855 ns 160851 ns 4353 bytes_per_second=48.5699G/s
BM_MemoryBandwidth/2097152 321657 ns 321643 ns 2177 bytes_per_second=48.5787G/s
BM_MemoryBandwidth/4194304 648490 ns 648454 ns 1005 bytes_per_second=48.1915G/s
BM_MemoryBandwidth/8388608 1307549 ns 1307485 ns 502 bytes_per_second=47.8017G/s
BM_MemoryBandwidth_BigO 0.16 N 0.16 N
BM_MemoryBandwidth_RMS 1 % 1 %
What am I misunderstanding about memory bandwidth that has made my calculations come out wrong by more than a factor of 2?
(Also, this is kinda an insane workflow to empirically determine how much memory bandwidth I have. Is there a better way?)
Full asm for sum_array after removing add instructions:
_Z9sum_arrayPdl:
.LVL0:
.LFB3624:
.file 1 "example_code/memory_bandwidth.cpp"
.loc 1 5 1 view -0
.cfi_startproc
.loc 1 6 5 view .LVU1
.loc 1 7 5 view .LVU2
.LBB1545:
# example_code/memory_bandwidth.cpp:7: for (long i =0 ; i < n; ++i) {
.loc 1 7 24 is_stmt 0 view .LVU3
test rsi, rsi # n
jle .L7 #,
lea rax, -1[rsi] # tmp105,
cmp rax, 1 # tmp105,
jbe .L8 #,
mov rdx, rsi # bnd.299, n
shr rdx # bnd.299
sal rdx, 4 # tmp107,
mov rax, rdi # ivtmp.311, v
add rdx, rdi # _44, v
pxor xmm0, xmm0 # vect_s_10.306
.LVL1:
.p2align 4,,10
.p2align 3
.L5:
.loc 1 8 9 is_stmt 1 discriminator 2 view .LVU4
# example_code/memory_bandwidth.cpp:8: s += v[i];
.loc 1 8 11 is_stmt 0 discriminator 2 view .LVU5
movupd xmm2, XMMWORD PTR [rax] # tmp115, MEM[base: _24, offset: 0B]
add rax, 16 # ivtmp.311,
.loc 1 8 11 discriminator 2 view .LVU6
cmp rax, rdx # ivtmp.311, _44
jne .L5 #,
movapd xmm1, xmm0 # tmp110, vect_s_10.306
unpckhpd xmm1, xmm0 # tmp110, vect_s_10.306
mov rax, rsi # tmp.301, n
and rax, -2 # tmp.301,
test sil, 1 # n,
je .L10 #,
.L3:
.LVL2:
.loc 1 8 9 is_stmt 1 view .LVU7
# example_code/memory_bandwidth.cpp:8: s += v[i];
.loc 1 8 11 is_stmt 0 view .LVU8
addsd xmm0, QWORD PTR [rdi+rax*8] # <retval>, *_3
.LVL3:
# example_code/memory_bandwidth.cpp:7: for (long i =0 ; i < n; ++i) {
.loc 1 7 5 view .LVU9
inc rax # i
.LVL4:
# example_code/memory_bandwidth.cpp:7: for (long i =0 ; i < n; ++i) {
.loc 1 7 24 view .LVU10
cmp rsi, rax # n, i
jle .L1 #,
.loc 1 8 9 is_stmt 1 view .LVU11
# example_code/memory_bandwidth.cpp:8: s += v[i];
.loc 1 8 11 is_stmt 0 view .LVU12
addsd xmm0, QWORD PTR [rdi+rax*8] # <retval>, *_6
.LVL5:
.loc 1 8 11 view .LVU13
ret
.LVL6:
.p2align 4,,10
.p2align 3
.L7:
.loc 1 8 11 view .LVU14
.LBE1545:
# example_code/memory_bandwidth.cpp:6: double s = 0;
.loc 1 6 12 view .LVU15
pxor xmm0, xmm0 # <retval>
.loc 1 10 5 is_stmt 1 view .LVU16
.LVL7:
.L1:
# example_code/memory_bandwidth.cpp:11: }
.loc 1 11 1 is_stmt 0 view .LVU17
ret
.p2align 4,,10
.p2align 3
.L10:
.loc 1 11 1 view .LVU18
ret
.LVL8:
.L8:
.LBB1546:
# example_code/memory_bandwidth.cpp:7: for (long i =0 ; i < n; ++i) {
.loc 1 7 15 view .LVU19
xor eax, eax # tmp.301
.LBE1546:
# example_code/memory_bandwidth.cpp:6: double s = 0;
.loc 1 6 12 view .LVU20
pxor xmm0, xmm0 # <retval>
jmp .L3 #
.cfi_endproc
.LFE3624:
.size _Z9sum_arrayPdl, .-_Z9sum_arrayPdl
.section .text.startup,"ax",@progbits
.p2align 4
.globl main
.type main, @function
Full output of lshw -class memory:
*-firmware
description: BIOS
vendor: American Megatrends Inc.
physical id: 0
version: 1.90
date: 10/21/2016
size: 64KiB
capacity: 15MiB
capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification uefi
*-memory
description: System Memory
physical id: 3c
slot: System board or motherboard
size: 16GiB
*-bank:0
description: [empty]
physical id: 0
slot: ChannelA-DIMM0
*-bank:1
description: DIMM DDR4 Synchronous 2400 MHz (0.4 ns)
product: CMU16GX4M2A2400C16
vendor: AMI
physical id: 1
serial: 00000000
slot: ChannelA-DIMM1
size: 8GiB
width: 64 bits
clock: 2400MHz (0.4ns)
*-bank:2
description: [empty]
physical id: 2
slot: ChannelB-DIMM0
*-bank:3
description: DIMM DDR4 Synchronous 2400 MHz (0.4 ns)
product: CMU16GX4M2A2400C16
vendor: AMI
physical id: 3
serial: 00000000
slot: ChannelB-DIMM1
size: 8GiB
width: 64 bits
clock: 2400MHz (0.4ns)
Is the CPU relevant here? Well here's the specs:
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 94
Model name: Intel(R) Pentium(R) CPU G4400 @ 3.30GHz
Stepping: 3
CPU MHz: 3168.660
CPU max MHz: 3300.0000
CPU min MHz: 800.0000
BogoMIPS: 6624.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0,1
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust erms invpcid rdseed smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d
The data produced by the clang build is much more intelligible: the throughput decreases monotonically until it settles at 19.8 GB/s once the vector is much larger than the cache.
From your hardware description, it looks like you have two DIMMs, one in each of two memory channels. This interleaves memory between the two DIMMs, so that memory accesses read from both of them. (One possibility is that bytes 0-7 are in DIMM1 and bytes 8-15 are in DIMM2, but this depends on the hardware implementation.) This doubles the peak memory bandwidth because you're accessing two modules in parallel instead of one.
Some systems support three or four channels, further increasing the maximum bandwidth.
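To make the arithmetic concrete, here is a minimal sketch of the peak-bandwidth calculation. The function names are made up for illustration. The inputs (2400 MT/s, 64-bit width, two populated channels) are read off the lshw output in the question, and the formula assumes ideal interleaving across channels, which real hardware only approximates.

```cpp
#include <cstdint>

// Theoretical peak DRAM bandwidth in bytes per second (hypothetical helper).
// transfers_per_sec: module data rate (lshw reports 2400 MHz, i.e. 2.4e9 transfers/s)
// bus_width_bits:    width of one channel (lshw reports 64 bits)
// channels:          number of populated channels (ChannelA and ChannelB here)
constexpr std::uint64_t peak_bandwidth_bytes_per_sec(std::uint64_t transfers_per_sec,
                                                     std::uint64_t bus_width_bits,
                                                     std::uint64_t channels) {
    return transfers_per_sec * (bus_width_bits / 8) * channels;
}

// Size in bytes of an array of n doubles, for comparison with the cache sizes
// reported by lscpu (hypothetical helper).
constexpr std::uint64_t working_set_bytes(std::uint64_t n) {
    return n * sizeof(double);
}
```

With one channel this gives 19,200,000,000 bytes/s, which is the 19.2 GB/s estimate from the question. With two channels it gives 38,400,000,000 bytes/s. The largest benchmark size, 8388608 doubles, is a 64 MiB working set, well beyond the 3072K L3 cache.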
I got runtime errors when trying to run the "Hello World!" program from the Suave web page:
open Suave.Web
open Suave.Successful
[<EntryPoint>]
let main argv =
startWebServer defaultConfig (OK "Hello, Suave!")
0
With these errors:
Loaded assembly: /Users/[myUserName]/Library/Mobile
Documents/com~apple~CloudDocs/Projects/SuaveRestApi/SuaveRestApi/bin/Debug/SuaveRestApi.exe
Loaded assembly: /Users/[myUserName]/Library/Mobile
Documents/com~apple~CloudDocs/Projects/SuaveRestApi/SuaveRestApi/bin/Debug/FSharp.Core.dll
[External] Loaded assembly: /Users/[myUserName]/Library/Mobile
Documents/com~apple~CloudDocs/Projects/SuaveRestApi/SuaveRestApi/bin/Debug/Suave.dll
[External] Loaded assembly:
/Library/Frameworks/Mono.framework/Versions/5.18.0/lib/mono/gac/System/4.0.0.0__b77a5c561934e089/System.dll
[External] Loaded assembly:
/Library/Frameworks/Mono.framework/Versions/5.18.0/lib/mono/gac/Mono.Security/4.0.0.0__0738eb9f132ed756/Mono.Security.dll
[External] Method idx 6984 is greater than number of rows (6984) in
PPDB MethodDebugInformation table, for method
<StartupCode$Suave>.$YoLo:.cctor in
'/Users/[myUserName]/Library/Mobile
Documents/com~apple~CloudDocs/Projects/SuaveRestApi/SuaveRestApi/bin/Debug/Suave.pdb'.
Likely a malformed PDB file. Stacktrace:
at <0xffffffff> at (wrapper managed-to-native)
object.__icall_wrapper_mono_generic_class_init (intptr) [0x00007] in
<699ddfd029d348d1a69fb1f9fc28d7cb>:0 at YoLo/UTF8.get_utf8 ()
[0x00000] in
/Users/ademar/Projects/release/suave/src/Suave/Utils/YoLo.fs:447 at
Suave.Successful.OK (string) [0x00000] in
/Users/ademar/Projects/release/suave/src/Suave/Combinators.fs:136 at
App.main (string[]) [0x00000] in /Users/[myUserName]/Library/Mobile
Documents/com~apple~CloudDocs/Projects/SuaveRestApi/SuaveRestApi/App.fs:6
at (wrapper runtime-invoke) .runtime_invoke_int_object
(object,intptr,intptr,intptr) [0x00057] in
<5c0abea33afd9276a7450383a3be0a5c>:0 Memory around native instruction
pointer (0x7fff71ccc23e): 0x7fff71ccc22e ff ff c3 90 90 90 b8 48 01
00 02 49 89 ca 0f 05 .......H...I.... 0x7fff71ccc23e 73 08 48 89 c7
e9 6f a1 ff ff c3 55 48 89 e5 41 s.H...o....UH..A 0x7fff71ccc24e 57
41 56 41 55 41 54 53 48 81 ec 88 00 00 00 49 WAVAUATSH......I
0x7fff71ccc25e 89 f4 48 8d 05 01 c3 cb 32 48 8b 00 48 8d 9d 60
..H.....2H..H..`
Native stacktrace:
0 mono 0x000000010ba29842
mono_dump_native_crash_info + 898 1 mono
0x000000010b9cd303 mono_handle_native_crash + 195 2
libsystem_platform.dylib 0x00007fff71d77b3d _sigtramp + 29
3 mono 0x000000010bcdf008
simple_lower_case_mapping_higharea_table0 + 20296 4
libsystem_c.dylib 0x00007fff71c351c9 abort + 127 5
mono 0x000000010bbf9c5f
mono_log_write_logfile + 351 6 mono
0x000000010bc15dac monoeg_g_log + 204 7 mono
0x000000010bac9e68 mono_ppdb_get_seq_points + 1112 8 mono
0x000000010b93492f mono_method_to_ir + 1151 9 mono
0x000000010b91ade7 mini_method_compile + 3479 10 mono
0x000000010b91e32c mono_jit_compile_method_inner + 76 11 mono
0x000000010b921cc9 mono_jit_compile_method_with_opt + 2665 12 mono
0x000000010b925d8e mono_jit_runtime_invoke + 526 13 mono
0x000000010bb241d4 do_runtime_invoke + 84 14 mono
0x000000010bb201a8 mono_runtime_class_init_full + 872 15 mono
0x000000010b97a69c mono_generic_class_init + 28 16 ???
0x000000010beb7e2c 0x0 + 4494949932 17 mono
0x000000010b9261d3 mono_jit_runtime_invoke + 1619 18 mono
0x000000010bb241d4 do_runtime_invoke + 84 19 mono
0x000000010bb27cac do_exec_main_checked + 156 20 mono
0x000000010b98d19d mono_jit_exec + 349 21 mono
0x000000010b99037f mono_main + 11615 22 mono
0x000000010b915078 main + 264 23 libdyld.dylib
0x00007fff71b8ced9 start + 1 Pkilling 0x70000ed57000 from 0x1115795c0
Pkilling 0x70000eb54000 from 0x1115795c0 Entering thread summarizer
pause from 0x1115795c0 Finished thread summarizer pause from
0x1115795c0.
Waiting for dumping threads to resume
Debug info from gdb:
(lldb) command source -s 0 '/tmp/mono-gdb-commands.UoI26p' Executing
commands in '/tmp/mono-gdb-commands.UoI26p'. (lldb) process attach
--pid 1457 Process 1457 stopped
* thread #1, name = 'tid_307', queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x00007fff71ccb2d2 libsystem_kernel.dylib__wait4 + 10 libsystem_kernel.dylib__wait4:
-> 0x7fff71ccb2d2 <+10>: jae 0x7fff71ccb2dc ; <+20>
0x7fff71ccb2d4 <+12>: movq %rax, %rdi
0x7fff71ccb2d7 <+15>: jmp 0x7fff71cc6381 ; cerror
0x7fff71ccb2dc <+20>: retq Target 0: (mono) stopped.
Executable module set to
"/Library/Frameworks/Mono.framework/Versions/5.18.0/bin/mono".
Architecture set to: x86_64h-apple-macosx. (lldb) thread list Process
1457 stopped
* thread #1: tid = 0xd6f5, 0x00007fff71ccb2d2 libsystem_kernel.dylib__wait4 + 10, name = 'tid_307', queue =
'com.apple.main-thread', stop reason = signal SIGSTOP thread #2: tid
= 0xd6f6, 0x00007fff71cc97de libsystem_kernel.dylib__psynch_cvwait + 10, name = 'SGen worker' thread #3: tid = 0xd6f8, 0x00007fff71cc61b6
libsystem_kernel.dylibsemaphore_wait_trap + 10, name = 'Finalizer'
thread #4: tid = 0xd6f9, 0x00007fff71cc93e6
libsystem_kernel.dylib__recvfrom + 10, name = 'Debugger agent'
thread #5: tid = 0xd700, 0x00007fff71cc7b6a
libsystem_kernel.dylib__workq_kernreturn + 10 thread #6: tid =
0xd701, 0x00007fff71cc7b6a libsystem_kernel.dylib__workq_kernreturn +
10 (lldb) thread backtrace all
* thread #1, name = 'tid_307', queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x00007fff71ccb2d2
libsystem_kernel.dylib__wait4 + 10
frame #1: 0x000000010ba299b6 monomono_dump_native_crash_info at mini-posix.c:1079 [opt]
frame #2: 0x000000010ba29862 monomono_dump_native_crash_info(signal="SIGABRT",
ctx=0x00007ffee42eb0f0, info=<unavailable>) at mini-posix.c:1102 [opt]
frame #3: 0x000000010b9cd303 monomono_handle_native_crash(signal="SIGABRT",
ctx=0x00007ffee42eb0f0, info=0x00007ffee42eb088) at
mini-exceptions.c:3197 [opt]
frame #4: 0x00007fff71d77b3d libsystem_platform.dylib_sigtramp + 29
frame #5: 0x00007fff71ccc23f libsystem_kernel.dylib__pthread_kill + 11
frame #6: 0x00007fff71d82c1c libsystem_pthread.dylibpthread_kill + 285
frame #7: 0x00007fff71c351c9 libsystem_c.dylibabort + 127
frame #8: 0x000000010bbf9c5f monomono_log_write_logfile(log_domain=<unavailable>,
level=<unavailable>, hdr=<unavailable>, message="Method idx 6984 is
greater than number of rows (6984) in PPDB MethodDebugInformation
table, for method <StartupCode$Suave>.$YoLo:.cctor in
'/Users/[myUserName]/Library/Mobile
Documents/com~apple~CloudDocs/Projects/SuaveRestApi/SuaveRestApi/bin/Debug/Suave.pdb'.
Likely a malformed PDB file.") at mono-log-common.c:135 [opt]
frame #9: 0x000000010bc15dac monomonoeg_g_log [inlined] monoeg_g_logstr(log_domain=0x0000000000000000,
log_level=G_LOG_LEVEL_ERROR, msg=) at goutput.c:117 [opt]
frame #10: 0x000000010bc15d83 monomonoeg_g_log [inlined] monoeg_g_logv_nofree(format=<unavailable>, args=<unavailable>) at
goutput.c:128 [opt]
frame #11: 0x000000010bc15d70 monomonoeg_g_log [inlined] monoeg_g_logv(format=, args=) at
goutput.c:135 [opt]
frame #12: 0x000000010bc15d70 monomonoeg_g_log(log_domain=0x0000000000000000,
log_level=G_LOG_LEVEL_ERROR, format=<unavailable>) at goutput.c:144
[opt]
frame #13: 0x000000010bac9e68 monomono_ppdb_get_seq_points(minfo=,
source_file=, source_file_list=,
source_files=, seq_points=0x00007ffee42eb730,
n_seq_points=) at debug-mono-ppdb.c:427 [opt]
frame #14: 0x000000010b93492f monomono_method_to_ir(cfg=<unavailable>, method=<unavailable>,
start_bblock=0x0000000000000000, end_bblock=0x0000000000000000,
return_var=0x0000000000000000, inline_args=0x0000000000000000,
inline_offset=<unavailable>, is_virtual_call=<unavailable>) at
method-to-ir.c:6674 [opt]
frame #15: 0x000000010b91ade7 monomini_method_compile(method=, opts=374434303,
domain=0x00007fbc91f02f00, flags=JIT_FLAG_RUN_CCTORS, parts=0,
aot_method_index=-1) at mini.c:3455 [opt]
frame #16: 0x000000010b91e32c monomono_jit_compile_method_inner(method=0x00007fbc94832ea8,
target_domain=0x00007fbc91f02f00, opt=<unavailable>,
error=0x00007ffee42ebd60) at mini.c:4043 [opt]
frame #17: 0x000000010b921cc9 monomono_jit_compile_method_with_opt(method=0x00007fbc94832ea8,
opt=, jit_only=1, error=0x00007ffee42ebd60) at
mini-runtime.c:2411 [opt]
frame #18: 0x000000010b925d8e monomono_jit_runtime_invoke [inlined] mono_jit_compile_method_jit_only(method=<unavailable>,
error=0x00007ffee42ebd60) at mini-runtime.c:2471 [opt]
frame #19: 0x000000010b925d6e monomono_jit_runtime_invoke(method=, obj=,
params=0x0000000000000000, exc=,
error=0x00007ffee42ebd60) at mini-runtime.c:3011 [opt]
frame #20: 0x000000010bb241d4 monodo_runtime_invoke(method=0x00007fbc94832ea8,
obj=0x0000000000000000, params=0x0000000000000000,
exc=0x00007ffee42ebd08, error=0x00007ffee42ebd60) at object.c:2960
[opt]
frame #21: 0x000000010bb201a8 monomono_runtime_class_init_full [inlined] mono_runtime_try_invoke(method=,
obj=, params=, error=0x00007ffee42ebd60) at
object.c:3069 [opt]
frame #22: 0x000000010bb20167 monomono_runtime_class_init_full(vtable=0x00007fbc9402ee60,
error=0x00007ffee42ebd60) at object.c:526 [opt]
frame #23: 0x000000010b97a69c monomono_generic_class_init(vtable=) at
jit-icalls.c:1511 [opt]
frame #24: 0x000000010beb7e2c
frame #25: 0x000000010b9261d3 monomono_jit_runtime_invoke(method=<unavailable>, obj=<unavailable>,
params=0x00007ffee42ec148, exc=0x00007fbc9402ee60,
error=<unavailable>) at mini-runtime.c:3148 [opt]
frame #26: 0x000000010bb241d4 monodo_runtime_invoke(method=0x00007fbc91f04798,
obj=0x0000000000000000, params=0x00007ffee42ec148,
exc=0x0000000000000000, error=0x00007ffee42ec180) at object.c:2960
[opt]
frame #27: 0x000000010bb27cac monodo_exec_main_checked [inlined] mono_runtime_invoke_checked(method=<unavailable>, obj=<unavailable>,
params=<unavailable>, error=0x00007ffee42ec180) at object.c:3128 [opt]
frame #28: 0x000000010bb27c9d monodo_exec_main_checked(method=0x00007fbc91f04798,
args=, error=0x00007ffee42ec180) at object.c:4949 [opt]
frame #29: 0x000000010b98d19d monomono_jit_exec at driver.g.c:1236 [opt]
frame #30: 0x000000010b98d18f monomono_jit_exec(domain=, assembly=,
argc=1, argv=0x00007ffee42ec550) at driver.g.c:1195 [opt]
frame #31: 0x000000010b99037f monomono_main [inlined] main_thread_handler at driver.g.c:1313 [opt]
frame #32: 0x000000010b990348 monomono_main(argc=4, argv=) at driver.g.c:2483 [opt]
frame #33: 0x000000010b915078 monomain [inlined] mono_main_with_options(argc=<unavailable>, argv=<unavailable>) at
main.c:50 [opt]
frame #34: 0x000000010b915064 monomain(argc=4, argv=) at main.c:405 [opt]
frame #35: 0x00007fff71b8ced9 libdyld.dylibstart + 1
frame #36: 0x00007fff71b8ced9 libdyld.dylibstart + 1 thread #2, name = 'SGen worker'
frame #0: 0x00007fff71cc97de libsystem_kernel.dylib__psynch_cvwait + 10
frame #1: 0x00007fff71d83593 libsystem_pthread.dylib_pthread_cond_wait + 724
frame #2: 0x000000010bbf10ee monothread_func [inlined] mono_os_cond_wait(mutex=<unavailable>) at mono-os-mutex.h:173 [opt]
frame #3: 0x000000010bbf10db monothread_func at sgen-thread-pool.c:165 [opt]
frame #4: 0x000000010bbf10cd monothread_func(data=0x0000000000000000) at sgen-thread-pool.c:196
[opt]
frame #5: 0x00007fff71d80305 libsystem_pthread.dylib_pthread_body + 126
frame #6: 0x00007fff71d8326f libsystem_pthread.dylib_pthread_start + 70
frame #7: 0x00007fff71d7f415 libsystem_pthread.dylibthread_start + 13 thread #3, name = 'Finalizer'
frame #0: 0x00007fff71cc61b6 libsystem_kernel.dylibsemaphore_wait_trap + 10
frame #1: 0x000000010bb9e9aa monofinalizer_thread [inlined] mono_os_sem_wait(flags=MONO_SEM_FLAGS_ALERTABLE) at
mono-os-semaphore.h:90 [opt]
frame #2: 0x000000010bb9e99f monofinalizer_thread at mono-coop-semaphore.h:43 [opt]
frame #3: 0x000000010bb9e985 monofinalizer_thread(unused=) at gc.c:918 [opt]
frame #4: 0x000000010bb49e13 monostart_wrapper [inlined] start_wrapper_internal at threads.c:1151 [opt]
frame #5: 0x000000010bb49d83 monostart_wrapper(data=0x00007fbc91f34850) at threads.c:1211 [opt]
frame #6: 0x00007fff71d80305 libsystem_pthread.dylib_pthread_body + 126
frame #7: 0x00007fff71d8326f libsystem_pthread.dylib_pthread_start + 70
frame #8: 0x00007fff71d7f415 libsystem_pthread.dylibthread_start + 13 thread #4, name = 'Debugger agent'
frame #0: 0x00007fff71cc93e6 libsystem_kernel.dylib__recvfrom + 10
frame #1: 0x000000010ba7e45e monosocket_transport_recv(buf=2001-01-01 00:00:00 UTC, len=11) at
debugger-agent.c:1108 [opt]
frame #2: 0x000000010ba6f296 monodebugger_thread [inlined] transport_recv(len=11) at debugger-agent.c:1518 [opt]
frame #3: 0x000000010ba6f27c monodebugger_thread(arg=<unavailable>) at debugger-agent.c:9484 [opt]
frame #4: 0x000000010bb49e13 monostart_wrapper [inlined] start_wrapper_internal at threads.c:1151 [opt]
frame #5: 0x000000010bb49d83 monostart_wrapper(data=0x00007fbc91f389a0) at threads.c:1211 [opt]
frame #6: 0x00007fff71d80305 libsystem_pthread.dylib_pthread_body + 126
frame #7: 0x00007fff71d8326f libsystem_pthread.dylib_pthread_start + 70
frame #8: 0x00007fff71d7f415 libsystem_pthread.dylibthread_start + 13 thread #5
frame #0: 0x00007fff71cc7b6a libsystem_kernel.dylib__workq_kernreturn + 10
frame #1: 0x00007fff71d7f630 libsystem_pthread.dylib_pthread_wqthread + 446
frame #2: 0x00007fff71d7f405 libsystem_pthread.dylibstart_wqthread + 13 thread #6
frame #0: 0x00007fff71cc7b6a libsystem_kernel.dylib__workq_kernreturn + 10
frame #1: 0x00007fff71d7f630 libsystem_pthread.dylib_pthread_wqthread + 446
frame #2: 0x00007fff71d7f405 libsystem_pthread.dylibstart_wqthread + 13 (lldb) detach
================================================================= Got a SIGABRT while executing native code. This usually indicates a fatal
error in the mono runtime or one of the native libraries used by your
application.
(lldb) quit Process 1457 detached
This seems to be an issue with Visual Studio for Mac and Suave. I ran the same test in Visual Studio Code (also on a Mac) and it worked fine.
I have a simple test program that sets both -XX:MetaspaceSize and -XX:MaxMetaspaceSize to the same value. I expected that the metaspace would then not be resized dynamically. But in my testing (the Metaspace graph in the VisualVM GC view and the output printed by jstat), I saw the metaspace keep growing from a low value up to the maximum I set. So is -XX:MetaspaceSize not working?
My testing code:
try {
while(true){
Enhancer enhancer = new Enhancer();
enhancer.setSuperclass(A.class);
enhancer.setUseCache(false);
enhancer.setCallback((MethodInterceptor) (obj, method, args1, methodProxy) -> methodProxy.invokeSuper(obj, args1));
enhancer.create();
Thread.sleep(50);
}
} catch (Throwable throwable) {
throwable.printStackTrace();
}
VM args:
-XX:MetaspaceSize=10m -XX:MaxMetaspaceSize=10m
Java version:
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
jstat result (the MC value keeps growing to around 10m):
C:\Users\dyu>jstat -gc 12336 1000 20
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
8192.0 8192.0 0.0 2207.8 49152.0 43399.2 131072.0 16.0 7168.0 6777.9 768.0 677.7 1 0.015 0 0.000 0.015
8192.0 8192.0 0.0 2207.8 49152.0 48166.1 131072.0 16.0 7168.0 6777.9 768.0 677.7 1 0.015 0 0.000 0.015
8192.0 8192.0 2592.0 0.0 49152.0 3691.9 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 7537.9 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 11378.9 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 16180.3 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 20021.3 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 24822.5 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 28663.5 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 33466.8 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 37312.8 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 42114.1 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 2592.0 0.0 49152.0 45955.1 131072.0 24.0 8832.0 8403.6 896.0 795.9 2 0.020 0 0.000 0.020
8192.0 8192.0 0.0 3488.0 49152.0 1925.1 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
8192.0 8192.0 0.0 3488.0 49152.0 6737.6 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
8192.0 8192.0 0.0 3488.0 49152.0 11758.5 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
8192.0 8192.0 0.0 3488.0 49152.0 15608.7 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
8192.0 8192.0 0.0 3488.0 49152.0 29056.4 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
8192.0 8192.0 0.0 3488.0 49152.0 34196.6 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
8192.0 8192.0 0.0 3488.0 49152.0 39339.7 131072.0 32.0 10752.0 10225.1 1024.0 934.2 3 0.026 0 0.000 0.026
You have misunderstood what -XX:MetaspaceSize does:
-XX:MetaspaceSize=size
Sets the size of the allocated class metadata space that will trigger a garbage collection the first time it is exceeded. This threshold for a garbage collection is increased or decreased depending on the amount of metadata used.
The name of that option might be considered misleading, unless you interpret it as “intended size”, so obviously it should trigger garbage collection when exceeded (while max size defines the hard limit).
There is an open bug report, JDK-8067205, calling for an option to set the initial metaspace size.
I run 25000 clients that each upload a log to the server once per second. The server crashes in the process. From the log file, we found that the crash was caused by a JVM crash. The error log shows:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f032964f085, pid=2043, tid=0x00007f02955cd700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_111-b14) (build 1.8.0_111-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x5c4085] G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*)+0x45
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid2043.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
My JVM arguments, system details, and other information:
VM Arguments:
jvm_args: -Xms256M -Xmx16G -XX:+UseG1GC -Dfile.encoding=UTF8 -Dserver_log_dir=/var/log/kaa -Dserver_log_sufix= -Dserver_home_dir=/usr/lib/kaa-node -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=7091 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
java_command: org.kaaproject.kaa.server.node.KaaNodeApplication
java_class_path (initial): /usr/lib/kaa-node/conf:/usr/lib/kaa-node/lib/spring-context-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/curator-client-2.9.0.jar:/usr/lib/kaa-node/lib/spring-jdbc-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/javax.annotation-api-1.2.jar:/usr/lib/kaa-node/lib/log4j-over-slf4j-1.7.7.jar:/usr/lib/kaa-node/lib/jna-4.0.0.jar:/usr/lib/kaa-node/lib/jetty-server-9.2.2.v20140723.jar:/usr/lib/kaa-node/lib/fastutil-6.5.7.jar:/usr/lib/kaa-node/lib/application-action-0.0.64.jar:/usr/lib/kaa-node/lib/commons-collections-3.2.1.jar:/usr/lib/kaa-node/lib/joda-time-2.2.jar:/usr/lib/kaa-node/lib/httpcore-4.3.2.jar:/usr/lib/kaa-node/lib/velocity-1.7.jar:/usr/lib/kaa-node/lib/spring-tx-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/hibernate-entitymanager-4.3.11.Final.jar:/usr/lib/kaa-node/lib/jetty-http-9.2.2.v20140723.jar:/usr/lib/kaa-node/lib/commons-cli-1.2.jar:/usr/lib/kaa-node/lib/gwt-client-0.2.1.jar:/usr/lib/kaa-node/lib/swagger-annotations-1.5.9.jar:/usr/lib/kaa-node/lib/jandex-1.1.0.Final.jar:/usr/lib/kaa-node/lib/core-0.10.0.jar:/usr/lib/kaa-node/lib/jetty-security-9.2.2.v20140723.jar:/usr/lib/kaa-node/lib/commons-compress-1.8.jar:/usr/lib/kaa-node/lib/jackson-core-asl-1.9.13.jar:/usr/lib/kaa-node/lib/netty-codec-4.0.34.Final.jar:/usr/lib/kaa-node/lib/dao-0.10.0.jar:/usr/lib/kaa-node/lib/file-appender-0.10.0.jar:/usr/lib/kaa-node/lib/spring-security-web-3.2.9.RELEASE.jar:/usr/lib/kaa-node/lib/gwtquery-1.4.2.jar:/usr/lib/kaa-node/lib/facebook-verifier-0.10.0.jar:/usr/lib/kaa-node/lib/transport-0.10.0-tcp.jar:/usr/lib/kaa-node/lib/spring-data-mongodb-1.9.4.RELEASE.jar:/usr/lib/kaa-node/lib/cassandra-driver-extras-3.0.0.jar:/usr/lib/kaa-node/lib/cassandra-all-3.4.jar:/usr/lib/kaa-node/lib/jcl-over-slf4j-1.7.21.jar:/usr/lib/kaa-node/lib/ant-launcher-1.9.4.jar:/usr/lib/kaa-node/lib/hamcrest-core-1.3.jar:/usr/lib/kaa-node/lib/aspectjrt-1.7.4.jar:/usr/lib/kaa-node/lib/guava-18.0.jar:/usr/lib/kaa-node/lib/spring-beans-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/kaa-node-0.10
Launcher Type: SUN_STANDARD
Environment Variables:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
SHELL=/bin/bash
SYSTEM:
OS:DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
uname:Linux 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64
libc:glibc 2.19 NPTL 2.19
rlimit: STACK 8192k, CORE 0k, NPROC 32768, NOFILE 65536, AS infinity
load average:23.81 21.57 24.39
/proc/meminfo:
MemTotal: 32629180 kB
MemFree: 11245384 kB
MemAvailable: 16204112 kB
Buffers: 116504 kB
Cached: 5084432 kB
SwapCached: 0 kB
Active: 9205152 kB
Inactive: 3234744 kB
Active(anon): 7260192 kB
Inactive(anon): 1048 kB
Active(file): 1944960 kB
Inactive(file): 3233696 kB
Unevictable: 8523068 kB
Mlocked: 8523068 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 3096 kB
Writeback: 0 kB
AnonPages: 15762160 kB
Mapped: 168560 kB
Shmem: 1384 kB
Slab: 280612 kB
SReclaimable: 181816 kB
SUnreclaim: 98796 kB
KernelStack: 17856 kB
PageTables: 36468 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 16314588 kB
Committed_AS: 18242600 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 15398912 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 70872 kB
DirectMap2M: 2754560 kB
DirectMap1G: 30408704 kB
CPU:total 8 (4 cores per cpu, 2 threads per core) family 6 model 60 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2
/proc/cpuinfo:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3800.109
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3860.859
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3893.906
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3800.109
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 6
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
Memory: 4k page, physical 32629180k(11245384k free), swap 0k(0k free)
vm_info: Java HotSpot(TM) 64-Bit Server VM (25.111-b14) for linux-amd64 JRE (1.8.0_111-b14), built on Sep 22 2016 16:14:03 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
time: Fri Nov 25 05:01:22 2016
elapsed time: 21112 seconds (0d 5h 51m 52s)
I have limited experience with the JVM. I have searched the net for a long time and found related errors on the Oracle site, but I didn't find a solution there. From my error log, the line
Memory: 4k page, physical 32629180k(11245384k free), swap 0k(0k free)
seems to show that too much physical memory is in use.
This can be caused by any bug that corrupts heap memory. It could be an issue with the GC, with the compiler, or with bad native code.
If you don't use any native libraries that might have corrupted the heap, this is a bug in the JVM. You should check whether Oracle already knows about the bug and (if not) file a bug report.
The name of the problematic frame (G1ParScanThreadState::copy_to_survivor_space) strongly suggests that the garbage collector (GC) crashes. So, as a workaround until the bug is fixed, you can try any of the following:
Monitor the garbage collector and make sure the memory usage doesn't increase over time and the garbage collector doesn't use too much CPU time
Change the garbage collector parameters (see Java's command line parameters)
Switch to a different garbage collector (see Java's command line parameters)
As you're trying to work around a bug, it's trial and error.
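For the monitoring suggestion, a minimal sketch using the standard `java.lang.management` MXBeans might look like the following. The class name `GcMonitor`, its helper methods, and the three-sample loop are assumptions made for illustration; they are not part of the application in the crash log, and a real service would more likely log these values periodically or read them over JMX.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class: samples heap usage and per-collector GC statistics.
public class GcMonitor {

    // One line per registered collector: name, number of collections,
    // and accumulated collection time in milliseconds.
    public static String gcSummary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    // Currently used heap memory in bytes.
    public static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples; steadily growing heap usage or GC time between
        // samples would be the warning sign mentioned above.
        for (int i = 0; i < 3; i++) {
            System.out.println("heap used (bytes): " + heapUsedBytes());
            System.out.print(gcSummary());
            Thread.sleep(1000);
        }
    }
}
```

For the other two suggestions, the collector is chosen on the command line. On a Java 8 HotSpot JVM such as the one in the log, `-XX:+UseParallelGC` or `-XX:+UseConcMarkSweepGC` can be tried in place of `-XX:+UseG1GC`. Whether either avoids this particular crash is not something the log can tell you.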
I'm trying to measure TLB misses on my laptop, which has the following configuration:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 61
model name : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
stepping : 4
microcode : 0x1d
cpu MHz : 1593.625
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt
bugs :
bogomips : 4389.43
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
The above is for processor 0, with similar results for processors 1, 2 and 3.
Here's my result for trying to measure TLB misses:
perf stat -B -e dTLB-load-misses sleep 2
Performance counter stats for 'sleep 2':
0 dTLB-load-misses
2.001923304 seconds time elapsed
Not sure how to interpret this. Any insights? I read somewhere that perf doesn't work well on Sandy Bridge laptops...