We are using Titan-1.0.0 supported by backed as cassandra 2.1.7. The cluster consists of 36 VMs each of 16GB RAM (6GB heapsize) and 16 CPU cores. We are using SSD disk for cassandra data and a normal HDD for commitlogs. The RF is 3 and reads/writes are being done at CF=2 each. The Java version is 1.8.0_45 There is heavy reads being served through this cluster. We are facing random outages with some servers at an average interval of around twice-thrice in day one of the server goes down. Let me share a particular instance of around 10-May-2017 04:00 AM, there were too many READ/MUTATION drops and we had to restart the server. Following are some of parameters that we have collected when the servers go down:
Can someone please please help us with some pointers, as to what is going wrong here and what can be done to stabilize the servers.
gc.log
2017-05-11T21:09:44.018+0530: 1900952.543: Total time for which application threads were stopped: 0.0156063 seconds, Stopping threads took: 0.0125466 seconds 2017-05-11T21:09:45.021+0530: 1900953.546: Total time for which application threads were stopped: 0.0031342 seconds, Stopping threads took: 0.0006697 seconds 2017-05-11T21:09:46.025+0530: 1900954.550: Total time for which application threads were stopped: 0.0037100 seconds, Stopping threads took: 0.0011525 seconds 2017-05-11T21:09:47.031+0530: 1900955.556: Total time for which application threads were stopped: 0.0057972 seconds, Stopping threads took: 0.0030861 seconds 2017-05-11T21:09:48.034+0530: 1900956.559: Total time for which application threads were stopped: 0.0029592 seconds, Stopping threads took: 0.0007129 seconds 2017-05-11T21:09:49.061+0530: 1900957.586: Total time for which application threads were stopped: 0.0268731 seconds, Stopping threads took: 0.0242117 seconds 2017-05-11T21:10:03.095+0530: 1900971.620: Total time for which application threads were stopped: 0.0045643 seconds, Stopping threads took: 0.0009709 seconds 2017-05-11T21:10:04.099+0530: 1900972.624: Total time for which application threads were stopped: 0.0033764 seconds, Stopping threads took: 0.0007673 seconds 2017-05-11T21:10:07.104+0530: 1900975.629: Total time for which application threads were stopped: 0.0056445 seconds, Stopping threads took: 0.0019495 seconds 2017-05-11T21:10:07.447+0530: 1900975.972: Total time for which application threads were stopped: 0.0045721 seconds, Stopping threads took: 0.0009695 seconds {Heap before GC invocations=85258 (full 0): garbage-first heap total 6291456K, used 5059586K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 1844 young (3776512K), 105 survivors (215040K) Metaspace used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K 2017-05-11T21:10:07.453+0530:
1900975.978: [GC pause (GCLocker Initiated GC) (young) Desired survivor size 242221056 bytes, new threshold 15 (max 15)
- age 1: 182659896 bytes, 182659896 total
- age 2: 21418160 bytes, 204078056 total
- age 3: 871016 bytes, 204949072 total
- age 4: 3600512 bytes, 208549584 total
- age 5: 1313096 bytes, 209862680 total
- age 6: 21152 bytes, 209883832 total , 0.0972867 secs] [Parallel Time: 83.6 ms, GC Workers: 13]
[GC Worker Start (ms): Min: 1900975980.9, Avg: 1900975986.0, Max: 1900975989.4, Diff: 8.5]
[Ext Root Scanning (ms): Min: 8.5, Avg: 11.9, Max: 17.4, Diff: 8.9, Sum: 154.3]
[Update RS (ms): Min: 26.8, Avg: 30.8, Max: 34.2, Diff: 7.4, Sum: 399.8]
[Processed Buffers: Min: 31, Avg: 87.3, Max: 144, Diff: 113, Sum: 1135]
[Scan RS (ms): Min: 0.1, Avg: 2.8, Max: 6.2, Diff: 6.1, Sum: 36.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum: 0.9]
[Object Copy (ms): Min: 31.5, Avg: 32.3, Max: 33.2, Diff: 1.7, Sum: 419.9]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 3.1]
[GC Worker Total (ms): Min: 74.7, Avg: 78.1, Max: 83.3, Diff: 8.7, Sum: 1014.7]
[GC Worker End (ms): Min: 1900976063.9, Avg: 1900976064.1, Max: 1900976064.3, Diff: 0.3] [Code Root Fixup: 0.1 ms] [Code Root Purge: 0.0 ms] [Clear CT: 1.4 ms] [Other: 12.1 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 1.0 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 1.6 ms]
[Humongous Reclaim: 0.4 ms]
[Free CSet: 4.5 ms] [Eden: 3478.0M(3476.0M)->0.0B(252.0M) Survivors: 210.0M->54.0M Heap: 4941.5M(6144.0M)->1175.0M(6144.0M)] Heap after GC invocations=85259 (full 0): garbage-first heap total 6291456K, used 1203201K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 27 young (55296K), 27 survivors (55296K) Metaspace used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K } [Times: user=0.58 sys=0.21, real=0.09 secs] 2017-05-11T21:10:07.551+0530:
1900976.076: Total time for which application threads were stopped: 0.1015506 seconds, Stopping threads took: 0.0004470 seconds 2017-05-11T21:10:08.557+0530: 1900977.082: Total time for which application threads were stopped: 0.0053236 seconds, Stopping threads took: 0.0010397 seconds {Heap before GC invocations=85259 (full 0): garbage-first heap total 6291456K, used 1461249K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 153 young (313344K), 27 survivors (55296K) Metaspace used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K 2017-05-11T21:10:08.995+0530: 1900977.520: [GC pause (G1 Evacuation Pause) (mixed) Desired survivor size 20971520 bytes, new threshold 3 (max 15)
- age 1: 19769440 bytes, 19769440 total
- age 2: 801568 bytes, 20571008 total
- age 3: 20679216 bytes, 41250224 total
- age 4: 738736 bytes, 41988960 total
- age 5: 3536096 bytes, 45525056 total
- age 6: 1242648 bytes, 46767704 total
- age 7: 19208 bytes, 46786912 total , 0.1879873 secs] [Parallel Time: 175.3 ms, GC Workers: 13]
[GC Worker Start (ms): Min: 1900977524.7, Avg: 1900977525.0, Max: 1900977525.3, Diff: 0.6]
[Ext Root Scanning (ms): Min: 14.5, Avg: 14.8, Max: 15.3, Diff: 0.8, Sum: 192.9]
[Update RS (ms): Min: 7.1, Avg: 7.9, Max: 13.3, Diff: 6.1, Sum: 103.2]
[Processed Buffers: Min: 4, Avg: 37.0, Max: 93, Diff: 89, Sum: 481]
[Scan RS (ms): Min: 55.7, Avg: 60.9, Max: 61.8, Diff: 6.1, Sum: 791.9]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.7, Diff: 0.7, Sum: 0.9]
[Object Copy (ms): Min: 90.3, Avg: 90.7, Max: 91.2, Diff: 0.9, Sum: 1179.5]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 0.8]
[GC Worker Total (ms): Min: 174.3, Avg: 174.6, Max: 174.8, Diff: 0.6, Sum: 2269.2]
[GC Worker End (ms): Min: 1900977699.6, Avg: 1900977699.6, Max: 1900977699.7, Diff: 0.1] [Code Root Fixup: 0.2 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.9 ms] [Other: 11.5 ms]
[Choose CSet: 3.3 ms]
[Ref Proc: 0.4 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 1.1 ms]
[Humongous Reclaim: 0.1 ms]
[Free CSet: 3.6 ms] [Eden: 252.0M(252.0M)->0.0B(3656.0M) Survivors: 54.0M->30.0M Heap: 1427.0M(6144.0M)->890.0M(6144.0M)] Heap after GC invocations=85260 (full 0): garbage-first heap total 6291456K, used 911310K [0x0000000640000000, 0x0000000640206000, 0x00000007c0000000) region size 2048K, 15 young (30720K), 15 survivors (30720K) Metaspace used 36253K, capacity 36654K, committed 36784K, reserved 1081344K class space used 3844K, capacity 3965K, committed 4016K, reserved 1048576K } [Times: user=1.59 sys=0.60, real=0.19 secs]
jstat
S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
0.00 100.00 2.29 37.85 98.56 95.74 79645 6351.963 0 0.000 6351.963
0.00 100.00 4.63 37.85 98.56 95.74 79645 6351.963 0 0.000 6351.963
0.00 100.00 7.81 37.85 98.56 95.74 79645 6351.963 0 0.000 6351.963
0.00 100.00 10.46 37.85 98.56 95.74 79645 6351.963 0 0.000 6351.963
0.00 100.00 13.17 37.81 98.56 95.74 79645 6351.963 0 0.000 6351.963
0.00 100.00 15.30 38.11 98.56 95.74 79645 6351.963 0 0.000 6351.963
/var/log/cassandra/system.log
WARN [SharedPool-Worker-1] 2017-05-11 21:08:49,621 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003]
WARN [SharedPool-Worker-1] 2017-05-11 21:08:52,647 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003]
WARN [GossipTasks:1] 2017-05-11 21:09:05,030 Gossiper.java:714 - Gossip stage has 1 pending tasks; skipping status check (no nodes will be marked down)
WARN [GossipTasks:1] 2017-05-11 21:09:16,588 FailureDetector.java:249 - Not marking nodes down due to local pause of 16886451559 > 5000000000
INFO [ScheduledTasks:1] 2017-05-11 21:09:16,954 MessagingService.java:888 - 1924 READ messages dropped in last 5000ms
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,153 StatusLogger.java:51 - Pool Name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,154 StatusLogger.java:66 - MutationStage 0 0 32568360 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - RequestResponseStage 1 0 12112153129 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - ReadRepairStage 0 0 593141839 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - CounterMutationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - ReadStage 15 0 12168491480 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,155 StatusLogger.java:66 - MiscStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - HintedHandoff 0 1 3531 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - GossipStage 0 0 5798815 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - InternalResponseStage 0 0 537 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,156 StatusLogger.java:66 - CommitLogArchiver 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - CompactionExecutor 0 0 29915673 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - ValidationExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - MigrationStage 0 0 3990 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,157 StatusLogger.java:66 - AntiEntropyStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,158 StatusLogger.java:66 - PendingRangeCalculator 0 0 255 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - Sampler 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - MemtableFlushWriter 0 0 7705 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - MemtablePostFlush 0 0 44884 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,159 StatusLogger.java:66 - MemtableReclaimMemory 0 0 7705 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:75 - CompactionManager 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:87 - MessagingService n/a 57/3
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:97 - Cache Type Size Capacity KeysToSave
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:99 - KeyCache 104857584 104857600 all
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:105 - RowCache 0 0 all
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,160 StatusLogger.java:112 - ColumnFamily Memtable ops,data
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.compaction_history 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.hints 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.IndexInfo 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.schema_columnfamilies 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.schema_triggers 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,162 StatusLogger.java:115 - system.size_estimates 762300,93404798
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.paxos 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.peer_events 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.range_xfers 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.compactions_in_progress 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.peers 283,103428
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.schema_keyspaces 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.schema_usertypes 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.local 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.sstable_activity 315,63532
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.schema_columns 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,163 StatusLogger.java:115 - system.batchlog 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.titan_ids 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.graphindex 1843,277464
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.system_properties_lock_ 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.system_properties 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.edgestore_lock_ 6,1191
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.systemlog 741,242777
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.txlog 741,242777
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.graphindex_lock_ 1332,277240
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_10042017.edgestore 75171,16624309
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_05052017.system_properties_lock_ 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,169 StatusLogger.java:115 - icmsgraph_05052017.system_properties 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.edgestore_lock_ 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.txlog 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.edgestore 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.graphindex 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.titan_ids 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.systemlog 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:17,170 StatusLogger.java:115 - icmsgraph_05052017.graphindex_lock_ 0,0
WARN [SharedPool-Worker-24] 2017-05-11 21:09:17,200 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003]
WARN [SharedPool-Worker-18] 2017-05-11 21:09:17,201 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003]
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,170 MessagingService.java:888 - 1 READ messages dropped in last 5000ms
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,170 StatusLogger.java:51 - Pool Name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - MutationStage 0 0 32568376 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - RequestResponseStage 0 0 12112193109 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - ReadRepairStage 0 0 593143878 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - CounterMutationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - ReadStage 0 0 12168508581 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,171 StatusLogger.java:66 - MiscStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - HintedHandoff 0 1 3531 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - GossipStage 0 0 5798844 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - InternalResponseStage 0 0 537 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - CommitLogArchiver 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - CompactionExecutor 0 0 29915673 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - ValidationExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - MigrationStage 0 0 3990 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - AntiEntropyStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - PendingRangeCalculator 0 0 255 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,172 StatusLogger.java:66 - Sampler 0 0 0 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:66 - MemtableFlushWriter 0 0 7705 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:66 - MemtablePostFlush 0 0 44884 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:66 - MemtableReclaimMemory 0 0 7705 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:75 - CompactionManager 0 0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:87 - MessagingService n/a 0/0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:97 - Cache Type Size Capacity KeysToSave
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:99 - KeyCache 104857584 104857600 all
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,174 StatusLogger.java:105 - RowCache 0 0 all
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,173 StatusLogger.java:99 - KeyCache 104857584 104857600 all
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,174 StatusLogger.java:105 - RowCache 0 0 all
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,174 StatusLogger.java:112 - ColumnFamily Memtable ops,data
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.titan_ids 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.graphindex 1844,277804
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.system_properties_lock_ 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.system_properties 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.edgestore_lock_ 6,1191
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.systemlog 741,242777
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.txlog 741,242777
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.graphindex_lock_ 1332,277240
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,179 StatusLogger.java:115 - icmsgraph_10042017.edgestore 75176,16625645
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.system_properties_lock_ 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.system_properties 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.edgestore_lock_ 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.txlog 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.edgestore 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.graphindex 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.titan_ids 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.systemlog 0,0
INFO [ScheduledTasks:1] 2017-05-11 21:09:22,180 StatusLogger.java:115 - icmsgraph_05052017.graphindex_lock_ 0,0
WARN [SharedPool-Worker-34] 2017-05-11 21:09:24,177 SliceQueryFilter.java:319 - Read 286 live and 12978 tombstone cells in icmsgraph_10042017.edgestore for key: f800000000104600 (see tombstone_warn_threshold). 2147483647 columns were requested, slices=[b02ea000800000350080000002-b02ea000800000350080000003]
ThreadDump
Around 600 Thrift threads:
"Thrift:20118" #124110 daemon prio=5 os_prio=0 tid=0x00007feab9a71460 nid=0xb1c6 runnable [0x00007fea6a578000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked (a java.io.BufferedInputStream)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
72 MessagingService-Incoming threads ...
"MessagingService-Incoming-/192.168.33.67" #124106 prio=5 os_prio=0 tid=0x00007feab98436b0 nid=0xb0ca runnable [0x00007fea7503a000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
- locked (a java.lang.Object) at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:192)
- locked (a java.lang.Object) at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
- locked (a sun.nio.ch.SocketAdaptor$SocketInputStream) at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:215) at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:149) at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:101) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:169) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:88)
sar -q
09:10:01 IST 5 1396 16.02 9.24 6.31 0
I think the problem is cassandra's tombstone
In Cassandra whenever a delete, insert explicitly null or TTL expire Tombstone created. Tombstones are a mechanism which allows Cassandra to write fast but it has an operational price to pay.
First of all, tombstones are themselves records. They take up space and can substantially increase the amount of storage you require.
Secondly, Large Number of Tombstones Causes Latency and Heap Pressure
From your cassandra's system log it seems that about 13K tombstone is created only for a single vertex (Vertex Key ID : 0xf800000000104600). It's huge. You can see that when this warning appear on cassandra's system.log, Heap Pressure increases and so gc.log printed at the same time.
What should i do ?
Cassandra will fully drop those tombstones when a compaction triggers, only after local_delete_time + gc_grace_seconds as defined on the table the data belongs to.
Run nodetool repair regularly, Once a week.
You can force a major compaction on one or more table with nodetool compact
Change the table compaction class. Which Compaction class to use
Read More :
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
https://opencredo.com/cassandra-tombstones-common-issues/
I run 25000 clients that just upload log to server every 1 second. The Server crashes in the process.From the log file, we found that the cause of the crash was the JVM crash.The Error log show :
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f032964f085, pid=2043, tid=0x00007f02955cd700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_111-b14) (build 1.8.0_111-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x5c4085] G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*)+0x45
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid2043.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
My JVM Arguments and System and more infos :
VM Arguments:
jvm_args: -Xms256M -Xmx16G -XX:+UseG1GC -Dfile.encoding=UTF8 -Dserver_log_dir=/var/log/kaa -Dserver_log_sufix= -Dserver_home_dir=/usr/lib/kaa-node -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=7091 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
java_command: org.kaaproject.kaa.server.node.KaaNodeApplication
java_class_path (initial): /usr/lib/kaa-node/conf:/usr/lib/kaa-node/lib/spring-context-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/curator-client-2.9.0.jar:/usr/lib/kaa-node/lib/spring-jdbc-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/javax.annotation-api-1.2.jar:/usr/lib/kaa-node/lib/log4j-over-slf4j-1.7.7.jar:/usr/lib/kaa-node/lib/jna-4.0.0.jar:/usr/lib/kaa-node/lib/jetty-server-9.2.2.v20140723.jar:/usr/lib/kaa-node/lib/fastutil-6.5.7.jar:/usr/lib/kaa-node/lib/application-action-0.0.64.jar:/usr/lib/kaa-node/lib/commons-collections-3.2.1.jar:/usr/lib/kaa-node/lib/joda-time-2.2.jar:/usr/lib/kaa-node/lib/httpcore-4.3.2.jar:/usr/lib/kaa-node/lib/velocity-1.7.jar:/usr/lib/kaa-node/lib/spring-tx-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/hibernate-entitymanager-4.3.11.Final.jar:/usr/lib/kaa-node/lib/jetty-http-9.2.2.v20140723.jar:/usr/lib/kaa-node/lib/commons-cli-1.2.jar:/usr/lib/kaa-node/lib/gwt-client-0.2.1.jar:/usr/lib/kaa-node/lib/swagger-annotations-1.5.9.jar:/usr/lib/kaa-node/lib/jandex-1.1.0.Final.jar:/usr/lib/kaa-node/lib/core-0.10.0.jar:/usr/lib/kaa-node/lib/jetty-security-9.2.2.v20140723.jar:/usr/lib/kaa-node/lib/commons-compress-1.8.jar:/usr/lib/kaa-node/lib/jackson-core-asl-1.9.13.jar:/usr/lib/kaa-node/lib/netty-codec-4.0.34.Final.jar:/usr/lib/kaa-node/lib/dao-0.10.0.jar:/usr/lib/kaa-node/lib/file-appender-0.10.0.jar:/usr/lib/kaa-node/lib/spring-security-web-3.2.9.RELEASE.jar:/usr/lib/kaa-node/lib/gwtquery-1.4.2.jar:/usr/lib/kaa-node/lib/facebook-verifier-0.10.0.jar:/usr/lib/kaa-node/lib/transport-0.10.0-tcp.jar:/usr/lib/kaa-node/lib/spring-data-mongodb-1.9.4.RELEASE.jar:/usr/lib/kaa-node/lib/cassandra-driver-extras-3.0.0.jar:/usr/lib/kaa-node/lib/cassandra-all-3.4.jar:/usr/lib/kaa-node/lib/jcl-over-slf4j-1.7.21.jar:/usr/lib/kaa-node/lib/ant-launcher-1.9.4.jar:/usr/lib/kaa-node/lib/hamcrest-core-1.3.jar:/usr/lib/kaa-node/lib/aspectjrt-1.7.4.jar:/usr/lib/kaa-node/lib/guava-18.0.jar:/usr/lib/kaa-node/lib/spring-beans-4.2.5.RELEASE.jar:/usr/lib/kaa-node/lib/kaa-node-0.10
Launcher Type: SUN_STANDARD
Environment Variables:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
SHELL=/bin/bash
SYSTEM:
OS:DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
uname:Linux 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64
libc:glibc 2.19 NPTL 2.19
rlimit: STACK 8192k, CORE 0k, NPROC 32768, NOFILE 65536, AS infinity
load average:23.81 21.57 24.39
/proc/meminfo:
MemTotal: 32629180 kB
MemFree: 11245384 kB
MemAvailable: 16204112 kB
Buffers: 116504 kB
Cached: 5084432 kB
SwapCached: 0 kB
Active: 9205152 kB
Inactive: 3234744 kB
Active(anon): 7260192 kB
Inactive(anon): 1048 kB
Active(file): 1944960 kB
Inactive(file): 3233696 kB
Unevictable: 8523068 kB
Mlocked: 8523068 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 3096 kB
Writeback: 0 kB
AnonPages: 15762160 kB
Mapped: 168560 kB
Shmem: 1384 kB
Slab: 280612 kB
SReclaimable: 181816 kB
SUnreclaim: 98796 kB
KernelStack: 17856 kB
PageTables: 36468 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 16314588 kB
Committed_AS: 18242600 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 15398912 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 70872 kB
DirectMap2M: 2754560 kB
DirectMap1G: 30408704 kB
CPU:total 8 (4 cores per cpu, 2 threads per core) family 6 model 60 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2
/proc/cpuinfo:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3800.109
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3860.859
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3893.906
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3800.109
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 6
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4790 CPU # 3.60GHz
stepping : 3
microcode : 0x1d
cpu MHz : 3799.968
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts
bugs :
bogomips : 7183.28
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
Memory: 4k page, physical 32629180k(11245384k free), swap 0k(0k free)
vm_info: Java HotSpot(TM) 64-Bit Server VM (25.111-b14) for linux-amd64 JRE (1.8.0_111-b14), built on Sep 22 2016 16:14:03 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
time: Fri Nov 25 05:01:22 2016
elapsed time: 21112 seconds (0d 5h 51m 52s)
In my limited experience with JVM. So I am searching for a long time on net and find related errors at Oracle site.But I didn't find a solution from it. From my error log:
Memory: 4k page, physical 32629180k(11245384k free), swap 0k(0k free)
show the physical memory occupied too much.This can be caused by any bug that corrupts heap memory. It could be an issue with GC, with the compiler, with bad native code.
If you don't use any native libraries that might have corrupted the heap, this a a bug in the JVM. You should check whether Oracle already knows about the bug and (if not) file a bug report.
The name of the problematic frame (G1ParScanThreadState::copy_to_survivor_space) strongly suggest that the garbage collector (GC) crashes. So for a workaround until the bug is fixed you can try any of the following:
Monitor the garbage collector and make sure the memory usage doesn't increase over time and the garbage collector doesn't use too much CPU time
Change the garbage collector parameters (see Java's command line
parameters)
Switch to a different garbage collector (see Java's command line parameters)
As you're trying to work around a bug, it's trial and error.