I am dealing with a custom implementation of the Wicket session store, data store, and page store. I have to cluster Wicket and make it work in the following situation:
There are 2 nodes in the cluster; node one fails and the user should be able to continue the flow without noticing. The pages are stateful, with a lot of Ajax requests. For now I'm storing the Wicket session in a custom storage over RMI, and I'm trying to extend the DiskPageStore. The new challenge is the SessionEntry inner class: it is still held by a ConcurrentMap.
My question is: Has anyone done this before? Do you have any suggestions on how to accomplish this?
My suggestion is to forget about DiskPageStore and SessionEntry in your situation. The ConcurrentMap you mentioned is held in the local heap. Once one of the nodes fails, there is no way to access its ConcurrentMap, and the Wicket resources referenced from it can never be released.
Therefore, in a clustered environment, you need to cluster the Wicket page store. Page versions can be expired based on a certain policy, or deliberately removed when their corresponding session expires.
I've enabled web session and data store clustering for Apache Wicket used in an enterprise web application in production, and it has been working very well. The software I use is:
JDK 1.8.0_60
Apache Tomcat 8.0.33 (Tomcat 7 works too)
Wicket 6.16 (versions 6.22.0 and 7.2.0 should also work)
Apache Ignite 1.7.0
Load balancer: Crossroads
Ubuntu 14.04.1
The idea is to use Apache Ignite for web session clustering; it is pretty straightforward if you follow its Web Session Clustering instructions.
Once I got the web session clustered, I put the data store (which already includes the page store) into the Ignite distributed data grid, while at the same time disabling the Wicket application-scoped cache (so as to make sure all data is clustered). Take a look at the documentation on Wicket's page store to find out how to configure the data store.
Alternatively, you should be able to use Wicket's HttpSessionDataStore to put the data store into the session; as the session is clustered, the data store is clustered automatically. But this approach did not work with Apache Ignite for me, so I use my own implementation of the IDataStore interface, which puts the data store into the Ignite distributed data grid. See the implementation below.
import java.util.concurrent.TimeUnit;

import javax.cache.expiry.Duration;
import javax.cache.expiry.TouchedExpiryPolicy;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMemoryMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.eviction.lru.LruEvictionPolicy;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.wicket.pageStore.IDataStore;
import org.apache.wicket.pageStore.memory.IDataStoreEvictionStrategy;
import org.apache.wicket.pageStore.memory.PageTable;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class IgniteDataStore implements IDataStore {

    private static final Logger log = LoggerFactory.getLogger(IgniteDataStore.class);

    private final IDataStoreEvictionStrategy evictionStrategy;

    private Ignite ignite;

    IgniteCache<String, PageTable> igniteCache;

    public IgniteDataStore(IDataStoreEvictionStrategy evictionStrategy) {
        this.evictionStrategy = evictionStrategy;

        CacheConfiguration<String, PageTable> cacheCfg = new CacheConfiguration<String, PageTable>("wicket-data-store");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED);
        cacheCfg.setBackups(1);
        cacheCfg.setMemoryMode(CacheMemoryMode.OFFHEAP_VALUES);
        cacheCfg.setOffHeapMaxMemory(2 * 1024L * 1024L * 1024L); // 2 gigabytes.
        cacheCfg.setEvictionPolicy(new LruEvictionPolicy<String, PageTable>(10000));
        cacheCfg.setExpiryPolicyFactory(TouchedExpiryPolicy.factoryOf(new Duration(TimeUnit.SECONDS, 14400)));
        log.info("IgniteDataStore timeout is set to 14400 seconds.");

        ignite = Ignition.ignite();
        igniteCache = ignite.getOrCreateCache(cacheCfg);
    }

    @Override
    public synchronized byte[] getData(String sessionId, int id) {
        PageTable pageTable = getPageTable(sessionId, false);
        byte[] pageAsBytes = null;
        if (pageTable != null) {
            pageAsBytes = pageTable.getPage(id);
        }
        return pageAsBytes;
    }

    @Override
    public synchronized void removeData(String sessionId, int id) {
        PageTable pageTable = getPageTable(sessionId, false);
        if (pageTable != null) {
            pageTable.removePage(id);
        }
    }

    @Override
    public synchronized void removeData(String sessionId) {
        PageTable pageTable = getPageTable(sessionId, false);
        if (pageTable != null) {
            pageTable.clear();
        }
        igniteCache.remove(sessionId);
    }

    @Override
    public synchronized void storeData(String sessionId, int id, byte[] data) {
        PageTable pageTable = getPageTable(sessionId, true);
        if (pageTable != null) {
            pageTable.storePage(id, data);
            evictionStrategy.evict(pageTable);
            igniteCache.put(sessionId, pageTable);
        } else {
            log.error("Cannot store the data for page with id '{}' in session with id '{}'", id, sessionId);
        }
    }

    @Override
    public synchronized void destroy() {
        igniteCache.clear();
    }

    @Override
    public boolean isReplicated() {
        return true;
    }

    @Override
    public boolean canBeAsynchronous() {
        return false;
    }

    private PageTable getPageTable(String sessionId, boolean create) {
        if (igniteCache.containsKey(sessionId)) {
            return igniteCache.get(sessionId);
        }
        if (!create) {
            return null;
        }
        PageTable pageTable = new PageTable();
        igniteCache.put(sessionId, pageTable);
        return pageTable;
    }
}
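For completeness, the store can be wired in from your WebApplication's init() by overriding DefaultPageManagerProvider. The following is just a sketch against the Wicket 6.x API; the PageNumberEvictionStrategy and its limit of 20 page versions per session are arbitrary example choices, not part of the setup described above.

@Override
public void init() {
    super.init();
    // Replace the default DiskDataStore with the Ignite-backed store.
    setPageManagerProvider(new DefaultPageManagerProvider(this) {
        @Override
        protected IDataStore newDataStore() {
            // Eviction strategy and limit are example values; tune them to your needs.
            return new IgniteDataStore(new PageNumberEvictionStrategy(20));
        }
    });
}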
Hope it helps.
I have tried to configure an existing Maven project to run using cucumber-junit-platform-engine.
I have used this repo as inspiration.
I added the Maven dependencies needed, as in the linked project using spring-boot-starter-parent version 2.4.5 and cucumber-jvm version 6.10.4.
I set the junit-platform properties as follows:
cucumber.execution.parallel.enabled=true
cucumber.execution.parallel.config.strategy=fixed
cucumber.execution.parallel.config.fixed.parallelism=4
I used the @Cucumber annotation in the runner class and @SpringBootTest for the classes with step definitions.
It seems to create parallel threads fine, but the problem is that it creates all the threads at the start and opens as many browser windows (drivers) as there are scenarios (e.g. 51 instead of 4).
I am using a CucumberHooks class to add logic before and after scenarios and I'm guessing it interferes with the runner because of the annotations I'm using:
import java.util.List;

import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;

import io.cucumber.java.After;
import io.cucumber.java.Before;
import io.cucumber.java.Scenario;
import io.cucumber.plugin.ConcurrentEventListener;
import io.cucumber.plugin.event.EventHandler;
import io.cucumber.plugin.event.EventPublisher;
import io.cucumber.plugin.event.TestRunFinished;
import io.cucumber.plugin.event.TestRunStarted;
import io.github.bonigarcia.wdm.WebDriverManager;
import io.github.bonigarcia.wdm.config.DriverManagerType; // package may differ in older WebDriverManager versions

public class CucumberHooks implements ConcurrentEventListener {

    private static final Logger LOGGER = LoggerFactory.getLogger(CucumberHooks.class);

    @Autowired
    private ScenarioContext scenarioContext;

    @Before
    public void beforeScenario(Scenario scenario) {
        scenarioContext.getNewDriverInstance();
        scenarioContext.setScenario(scenario);
        LOGGER.info("Driver initialized for scenario - {}", scenario.getName());
        ....
        <some business logic here>
        ....
    }

    @After
    public void afterScenario() {
        Scenario scenario = scenarioContext.getScenario();
        WebDriver driver = scenarioContext.getDriver();
        takeErrorScreenshot(scenario, driver);
        LOGGER.info("Driver will close for scenario - {}", scenario.getName());
        driver.quit();
    }

    private void takeErrorScreenshot(Scenario scenario, WebDriver driver) {
        if (scenario.isFailed()) {
            final byte[] screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES);
            scenario.attach(screenshot, "image/png", "Failure");
        }
    }

    @Override
    public void setEventPublisher(EventPublisher eventPublisher) {
        eventPublisher.registerHandlerFor(TestRunStarted.class, beforeAll);
    }

    private EventHandler<TestRunStarted> beforeAll = event -> {
        // something that needs doing before everything
        .....<some business logic here>....
        WebDriverManager.getInstance(DriverManagerType.CHROME).setup();
    };
}
I tried replacing the @Before annotation from io.cucumber.java with @BeforeEach from org.junit.jupiter.api, and it does not work.
How can I solve this issue?
New answer: JUnit 5 has been improved somewhat.
If you are on Java 9+ you can use the following in junit-platform.properties to enable a custom parallelism.
cucumber.execution.parallel.enabled=true
cucumber.execution.parallel.config.strategy=custom
cucumber.execution.parallel.config.custom.class=com.example.MyCustomParallelStrategy
And you'd implement MyCustomParallelStrategy as:
package com.example;
import org.junit.platform.engine.ConfigurationParameters;
import org.junit.platform.engine.support.hierarchical.ParallelExecutionConfiguration;
import org.junit.platform.engine.support.hierarchical.ParallelExecutionConfigurationStrategy;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Predicate;
public class MyCustomParallelStrategy implements ParallelExecutionConfiguration, ParallelExecutionConfigurationStrategy {

    private static final int FIXED_PARALLELISM = 4;

    @Override
    public ParallelExecutionConfiguration createConfiguration(final ConfigurationParameters configurationParameters) {
        return this;
    }

    @Override
    public Predicate<? super ForkJoinPool> getSaturatePredicate() {
        return (ForkJoinPool p) -> true;
    }

    @Override
    public int getParallelism() {
        return FIXED_PARALLELISM;
    }

    @Override
    public int getMinimumRunnable() {
        return FIXED_PARALLELISM;
    }

    @Override
    public int getMaxPoolSize() {
        return FIXED_PARALLELISM;
    }

    @Override
    public int getCorePoolSize() {
        return FIXED_PARALLELISM;
    }

    @Override
    public int getKeepAliveSeconds() {
        return 30;
    }
}
On Java 9+ this will limit the max pool size of the underlying ForkJoinPool to FIXED_PARALLELISM, and there should never be more than 8 web drivers active at the same time.
Also, once JUnit5/#3044 is merged, released and integrated into Cucumber, you can use cucumber.execution.parallel.config.fixed.max-pool-size on Java 9+ to limit the maximum number of concurrent tests.
So as it turns out, parallelism is mostly a suggestion. Cucumber uses JUnit 5's ForkJoinPoolHierarchicalTestExecutorService, which constructs a ForkJoinPool.
From the docs on ForkJoinPool:
For applications that require separate or custom pools, a ForkJoinPool may be constructed with a given target parallelism level; by default, equal to the number of available processors. The pool attempts to maintain enough active (or available) threads by dynamically adding, suspending, or resuming internal worker threads, even if some tasks are stalled waiting to join others. However, no such adjustments are guaranteed in the face of blocked I/O or other unmanaged synchronization.
So within a ForkJoinPool, whenever a thread blocks, for example because it starts asynchronous communication with the web driver, another thread may be started to maintain the parallelism.
Since all threads wait, more threads are added to the pool and more web drivers are started.
This means that rather than relying on the ForkJoinPool to limit the number of web drivers, you have to do it yourself. You can use a library like Apache Commons Pool or implement a rudimentary pool using a counting semaphore.
import java.util.concurrent.Semaphore;

import org.openqa.selenium.WebDriver;
import org.springframework.stereotype.Component;

import io.cucumber.spring.ScenarioScope;

@Component
@ScenarioScope
public class ScenarioContext {

    private static final int MAX_CONCURRENT_WEB_DRIVERS = 1;
    private static final Semaphore semaphore = new Semaphore(MAX_CONCURRENT_WEB_DRIVERS, true);

    private WebDriver driver;

    public WebDriver getDriver() {
        if (driver != null) {
            return driver;
        }
        try {
            semaphore.acquire();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        try {
            driver = CustomChromeDriver.getInstance(); // CustomChromeDriver is your own driver factory
        } catch (Throwable t) {
            semaphore.release();
            throw t;
        }
        return driver;
    }

    public void retireDriver() {
        if (driver == null) {
            return;
        }
        try {
            driver.quit();
        } finally {
            driver = null;
            semaphore.release();
        }
    }
}
I need to process data from a REST web service. The following is a basic example:
import org.springframework.batch.item.ItemReader;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;
import java.util.Arrays;
import java.util.List;
class RESTDataReader implements ItemReader<DataDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextDataIndex;
    private List<DataDTO> data;

    RESTDataReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextDataIndex = 0;
    }

    @Override
    public DataDTO read() throws Exception {
        if (dataIsNotInitialized()) {
            data = fetchDataFromAPI();
        }
        DataDTO nextData = null;
        if (nextDataIndex < data.size()) {
            nextData = data.get(nextDataIndex);
            nextDataIndex++;
        } else {
            nextDataIndex = 0;
            data = null;
        }
        return nextData;
    }

    private boolean dataIsNotInitialized() {
        return this.data == null;
    }

    private List<DataDTO> fetchDataFromAPI() {
        ResponseEntity<DataDTO[]> response = restTemplate.getForEntity(apiUrl, DataDTO[].class);
        DataDTO[] data = response.getBody();
        return Arrays.asList(data);
    }
}
However, my fetchDataFromAPI method is called with time slots and it can return more than 20 million objects.
For example: if I call it between 01/01/2020 and 01/01/2021, I'll get 80 million records.
PS: the web service paginates by single days, i.e. if I want to retrieve the data between 01/09/2020 and 07/09/2020 I have to call it several times (between 01/09-02/09, then between 02/09-03/09, and so on until 06/09-07/09).
My problem in this case is heap space: I run out of memory when the data is bulky.
I had to create a step for each month in my BatchConfiguration to avoid this problem (12 steps): the first step calls the web service between 01/01/2020 and 01/02/2020, and so on.
Is there a solution to read all this volume of data with only one step before going to the processor?
Thanks in advance
Since your web service does not provide pagination within a single day, you need to ensure that the process that calls this web service (i.e. your Spring Batch job) has enough memory to store all items returned by this service.
For example: if I call it between 01/01/2020 and 01/01/2021, I'll get 80 million records.
This means that if you call this web service with curl on a machine that does not have enough memory to hold the result, the curl command will fail. The point I want to make here is that the only way to solve this issue is to give the JVM that runs your Spring Batch job enough memory to hold such a big result set.
As a side note: if you have control over this web service, I highly recommend improving it by introducing a more granular pagination mechanism.
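If you do add such pagination, a reader can then pull one page per call and keep memory bounded by the page size. The sketch below only illustrates the idea; the page and size query parameters are hypothetical and would have to match whatever pagination the improved service exposes.

import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;

class PagedRESTDataReader implements ItemReader<DataDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;
    private final int pageSize = 1000;

    private int page = 0;
    private Iterator<DataDTO> currentPage = Collections.emptyIterator();
    private boolean exhausted = false;

    PagedRESTDataReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
    }

    @Override
    public DataDTO read() {
        if (!currentPage.hasNext() && !exhausted) {
            // Fetch a single page per call, so at most pageSize items are held in memory.
            DataDTO[] body = restTemplate
                    .getForEntity(apiUrl + "?page=" + page++ + "&size=" + pageSize, DataDTO[].class)
                    .getBody();
            if (body == null || body.length == 0) {
                exhausted = true;
            } else {
                currentPage = Arrays.asList(body).iterator();
            }
        }
        return currentPage.hasNext() ? currentPage.next() : null; // null tells Spring Batch the input is finished
    }
}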
I have a Spring cache requirement:
I need to make a request to a server to get some data and store the result in the Spring cache. The same request can give me different results every time, so I decided to use @CachePut so that on every call the method is executed and the cache gets updated.
@CachePut(value = "mycache", key = "#url")
public String getData(String url) {
    try {
        // get the data from the server
        // update the cache
        // return the data
    } catch (Exception e) {
        // return data from the cache
    }
}
Now there is a twist: if the server is down and I am not able to get a response, I want the data from the cache (stored by previous requests).
If I use @Cacheable, I can't get the updated data. What is a clean way to do this? Something like catching the exception and returning the data from the cache.
You can get the cache implementation like this and handle the cache yourself:
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.CacheConfig;
import org.springframework.cache.annotation.CachePut;
import org.springframework.stereotype.Service;

@Service
@CacheConfig(cacheNames = "mycache") // refer to cache/ehcache-xxxx.xml
public class CacheService {

    @Autowired
    private CacheManager manager;

    @CachePut(key = "#url")
    public String getData(String url) {
        try {
            // do something.
            return null;
        } catch (Exception e) {
            Cache cache = manager.getCache("mycache");
            return (String) cache.get(url).get();
        }
    }
}
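One caveat with the fallback above: manager.getCache("mycache").get(url) returns null when the URL has never been cached (for example when the very first call already fails), so the catch block would throw a NullPointerException. A guarded variant of the fallback could look like the helper below, a sketch that reuses the injected manager field from the service above.

private String fromCacheOrRethrow(String url, Exception original) {
    Cache cache = manager.getCache("mycache");
    Cache.ValueWrapper cached = (cache != null) ? cache.get(url) : null;
    if (cached == null) {
        // Nothing cached yet, so there is no fallback value to return.
        throw new RuntimeException(original);
    }
    return (String) cached.get();
}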
So, I was looking at caching methods in Java (Spring), and Guava looked like it would serve the purpose.
This is the use case:
I query for some data from a remote service, a kind of configuration field for my application. This field will be used by every inbound request to my application, and it would be expensive to call the remote service every time, as it is essentially a constant that changes periodically.
So, on the first inbound request to my application, when I call the remote service, I cache the value with an expiry time of 30 minutes. After 30 minutes, when the cache has expired and there is a request to retrieve the key, I would like a callback or something to call the remote service, set the cache, and return the value for that key.
How can I do it in Guava cache?
Here is an example of how to use a Guava cache. If you want the removal listener to be invoked, you need to call cleanUp(); here I run a thread that calls cleanUp() every 30 minutes.
import com.google.common.cache.*;
import org.springframework.stereotype.Component;

import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

@Component
public class Cache {

    public static LoadingCache<String, String> REQUIRED_CACHE;

    public Cache() {
        RemovalListener<String, String> REMOVAL_LISTENER = new RemovalListener<String, String>() {
            @Override
            public void onRemoval(RemovalNotification<String, String> notification) {
                if (notification.getCause() == RemovalCause.EXPIRED) {
                    // do as per your requirement
                }
            }
        };

        CacheLoader<String, String> LOADER = new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                return null; // return as per your requirement if the key's value is not found
            }
        };

        REQUIRED_CACHE = CacheBuilder.newBuilder().maximumSize(100000000)
                .expireAfterWrite(30, TimeUnit.MINUTES)
                .removalListener(REMOVAL_LISTENER)
                .build(LOADER);

        Executors.newSingleThreadExecutor().submit(() -> {
            while (true) {
                REQUIRED_CACHE.cleanUp(); // need to call cleanUp for the removal listener
                TimeUnit.MINUTES.sleep(30L);
            }
        });
    }
}
put & get data:
Cache.REQUIRED_CACHE.get("key");
Cache.REQUIRED_CACHE.put("key","value");
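For the original use case (transparently refreshing the value after the 30-minute expiry), the CacheLoader itself can make the remote call, so get() reloads the value on the first access after expiry and no separate callback is needed. A minimal sketch, where remoteService.fetchConfig(key) stands in for your actual remote call:

LoadingCache<String, String> configCache = CacheBuilder.newBuilder()
        .expireAfterWrite(30, TimeUnit.MINUTES)
        .build(new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                // Invoked on every cache miss, including the first get() after expiry.
                return remoteService.fetchConfig(key); // hypothetical remote call
            }
        });

String value = configCache.get("config-key"); // loads on first access and again after each expiry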
I have been working on a Java application which crawls pages from the Internet with HttpClient (version 4.3.3). It uses one fixed thread pool with 5 threads, each running a loop. The pseudocode is as follows.
public class Spiderling implements Runnable {
    @Override
    public void run() {
        while (true) {
            T task = null;
            try {
                task = scheduler.poll();
                if (task != null) {
                    if Ehcache contains task's config
                        taskConfig = Ehcache.getConfig;
                    else {
                        taskConfig = Query task config from db; // close the conn every time
                        put taskConfig into Ehcache
                    }
                    spider(task, taskConfig);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        LOG.error("spiderling is DEAD");
    }
}
I am running it with the following arguments: -Duser.timezone=GMT+8 -server -Xms1536m -Xmx1536m -Xloggc:/home/datalord/logs/gc-2016-07-23-10-28-24.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintHeapAtGC on a server (2 CPUs, 2 GB memory), and it crashes pretty regularly, about once every two or three days, with no OutOfMemoryError and no JVM error log.
Here is my analysis:
I analyzed the GC log with GCeasy; the report is here. The weird thing is that the Old Gen grows slowly until it reaches the allocated max heap size, but a Full GC never happens, not even once.
I suspected a memory leak, so I dumped the heap with jmap -dump:format=b,file=soldier.bin and used Eclipse MAT to analyze the dump file. Here is the problem suspect, an object that occupies 280+ MB:
The class "com.mysql.jdbc.NonRegisteringDriver",
loaded by "sun.misc.Launcher$AppClassLoader # 0xa0018490", occupies 281,118,144
(68.91%) bytes. The memory is accumulated in one instance of
"java.util.concurrent.ConcurrentHashMap$Segment[]" loaded by "".
Keywords
com.mysql.jdbc.NonRegisteringDriver
java.util.concurrent.ConcurrentHashMap$Segment[]
sun.misc.Launcher$AppClassLoader @ 0xa0018490.
I use c3p0-0.9.1.2 as the MySQL connection pool, mysql-connector-java-5.1.34 as the JDBC connector, and Ehcache-2.6.10 as the in-memory cache. I have seen all the posts about the 'com.mysql.jdbc.NonRegisteringDriver memory leak' and still have no clue.
This problem has driven me crazy for several days, any advice or help will be appreciated!
**********************Supplementary description on 07-24****************
I use a Java web + ORM framework called JFinal (github.com/jfinal/jfinal), which is open source on GitHub.
Here is some core code to further describe the problem.
/**
 * CacheKit. Useful tool box for EhCache.
 *
 */
public class CacheKit {

    private static CacheManager cacheManager;
    private static final Logger log = Logger.getLogger(CacheKit.class);

    static void init(CacheManager cacheManager) {
        CacheKit.cacheManager = cacheManager;
    }

    public static CacheManager getCacheManager() {
        return cacheManager;
    }

    static Cache getOrAddCache(String cacheName) {
        Cache cache = cacheManager.getCache(cacheName);
        if (cache == null) {
            synchronized (cacheManager) {
                cache = cacheManager.getCache(cacheName);
                if (cache == null) {
                    log.warn("Could not find cache config [" + cacheName + "], using default.");
                    cacheManager.addCacheIfAbsent(cacheName);
                    cache = cacheManager.getCache(cacheName);
                    log.debug("Cache [" + cacheName + "] started.");
                }
            }
        }
        return cache;
    }

    public static void put(String cacheName, Object key, Object value) {
        getOrAddCache(cacheName).put(new Element(key, value));
    }

    @SuppressWarnings("unchecked")
    public static <T> T get(String cacheName, Object key) {
        Element element = getOrAddCache(cacheName).get(key);
        return element != null ? (T) element.getObjectValue() : null;
    }

    @SuppressWarnings("rawtypes")
    public static List getKeys(String cacheName) {
        return getOrAddCache(cacheName).getKeys();
    }

    public static void remove(String cacheName, Object key) {
        getOrAddCache(cacheName).remove(key);
    }

    public static void removeAll(String cacheName) {
        getOrAddCache(cacheName).removeAll();
    }

    @SuppressWarnings("unchecked")
    public static <T> T get(String cacheName, Object key, IDataLoader dataLoader) {
        Object data = get(cacheName, key);
        if (data == null) {
            data = dataLoader.load();
            put(cacheName, key, data);
        }
        return (T) data;
    }

    @SuppressWarnings("unchecked")
    public static <T> T get(String cacheName, Object key, Class<? extends IDataLoader> dataLoaderClass) {
        Object data = get(cacheName, key);
        if (data == null) {
            try {
                IDataLoader dataLoader = dataLoaderClass.newInstance();
                data = dataLoader.load();
                put(cacheName, key, data);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        return (T) data;
    }
}
I use CacheKit like CacheKit.get("cfg_extract_rule_tree", extractRootId, new ExtractRuleTreeDataloader(extractRootId)), and the class ExtractRuleTreeDataloader will be called if nothing is found in the cache for extractRootId.
public class ExtractRuleTreeDataloader implements IDataLoader {

    public static final Logger LOG = LoggerFactory.getLogger(ExtractRuleTreeDataloader.class);

    private int ruleTreeId;

    public ExtractRuleTreeDataloader(int ruleTreeId) {
        super();
        this.ruleTreeId = ruleTreeId;
    }

    @Override
    public Object load() {
        List<Record> ruleTreeList = Db.find("SELECT * FROM cfg_extract_fule WHERE root_id=?", ruleTreeId);
        TreeHelper<ExtractRuleNode> treeHelper = ExtractUtil.batchRecordConvertTree(ruleTreeList); // convert List<Record> to a tree
        if (treeHelper.isValidTree()) {
            return treeHelper.getRoot();
        } else {
            LOG.warn("rule tree id :{} is an error tree #end#", ruleTreeId);
            return null;
        }
    }
}
As I said before, I use the JFinal ORM. The Db.find method code is:
public List<Record> find(String sql, Object... paras) {
    Connection conn = null;
    try {
        conn = config.getConnection();
        return find(config, conn, sql, paras);
    } catch (Exception e) {
        throw new ActiveRecordException(e);
    } finally {
        config.close(conn);
    }
}
and the config close method code is
public final void close(Connection conn) {
    if (threadLocal.get() == null)  // in transaction if conn in threadlocal
        if (conn != null)
            try { conn.close(); } catch (SQLException e) { throw new ActiveRecordException(e); }
}
There is no transaction in my code, so I am pretty sure conn.close() is called every time.
**********************more description on 07-28****************
First, I use Ehcache to store the taskConfigs in memory. The taskConfigs almost never change, so I want to store them in memory eternally and overflow them to disk if memory cannot hold them all.
I used MAT to find the GC roots of NonRegisteringDriver; the result is shown in the following picture.
[Image: the GC roots of NonRegisteringDriver]
But I still don't understand why the default behavior of Ehcache leads to a memory leak. TaskConfig is a class that extends the Model class.
public class TaskConfig extends Model<TaskConfig> {
    private static final long serialVersionUID = 5000070716569861947L;
    public static TaskConfig DAO = new TaskConfig();
}
The source code of Model is on this page (github.com/jfinal/jfinal/blob/jfinal-2.0/src/com/jfinal/plugin/activerecord/Model.java), and I can't find any reference (either direct or indirect) to the connection object, as @Jeremiah guessed.
Then I read the source code of NonRegisteringDriver, and I don't understand why its connectionPhantomRefs map field holds more than 5000 <ConnectionPhantomReference, ConnectionPhantomReference> entries while no ConnectionImpl shows up in its refQueue field, because I can see the cleanup code in the class AbandonedConnectionCleanupThread, which removes the ref from NonRegisteringDriver.connectionPhantomRefs after taking an abandoned connection ref from NonRegisteringDriver.refQueue:
@Override
public void run() {
    threadRef = this;
    while (running) {
        try {
            Reference<? extends ConnectionImpl> ref = NonRegisteringDriver.refQueue.remove(100);
            if (ref != null) {
                try {
                    ((ConnectionPhantomReference) ref).cleanup();
                } finally {
                    NonRegisteringDriver.connectionPhantomRefs.remove(ref);
                }
            }
        } catch (Exception ex) {
            // no where to really log this if we're static
        }
    }
}
I appreciate the help offered by @Jeremiah!
From the comments above I'm almost certain your memory leak is actually memory usage from EhCache. The ConcurrentHashMap you're seeing is the one backing the MemoryStore, and I'm guessing that the taskConfig holds a reference (either directly or indirectly) to the connection object, which is why it's showing in your stack.
Having eternal="true" in the default cache makes it so the inserted objects are never allowed to expire. Even without that, the timeToLive and timeToIdle values default to an infinite lifetime!
Combine that with the fact that the default behavior of Ehcache when retrieving elements is to copy them (last I checked) through serialization! You're just stacking up new object references each time the taskConfig is extracted and put back into Ehcache.
The best way to test this (in my opinion) is to change your default cache configuration. Change eternal to false, and implement a timeToIdle value. timeToIdle is a time (in seconds) that a value may exist in the cache without being accessed.
<ehcache>
    <diskStore path="java.io.tmpdir"/>
    <defaultCache
        maxElementsInMemory="10000"
        eternal="false"
        timeToIdleSeconds="120"
        overflowToDisk="true"
        diskPersistent="false"
        diskExpiryThreadIntervalSeconds="120"/>
</ehcache>
If that works, then you may want to look into further tweaking your ehcache configuration settings, or providing a more customized cache reference other than default for your class.
There are multiple performance considerations when tweaking the ehcache. I'm sure that there is a better configuration for your business model. The Ehcache documentation is good, but I found the site to be a bit scattered when I was trying to figure it out. I've listed some links that I found useful below.
http://www.ehcache.org/documentation/2.8/configuration/cache-size.html
http://www.ehcache.org/documentation/2.8/configuration/configuration.html
http://www.ehcache.org/documentation/2.8/apis/cache-eviction-algorithms.html#provided-memorystore-eviction-algorithms
Good luck!
To test your memory leak, try the following:
Insert a TaskConfig into Ehcache.
Immediately retrieve it back out of the cache.
Output the value of TaskConfig1.equals(TaskConfig2).
If it returns false, that is your memory leak. Override equals and hashCode in your TaskConfig object and rerun the test.
The root cause of the crashes is that the Linux OS runs out of memory and the OOM killer kills the Java process.
I found the log in /var/log/messages like following.
Aug 3 07:24:03 iZ233tupyzzZ kernel: Out of memory: Kill process 17308 (java) score 890 or sacrifice child
Aug 3 07:24:03 iZ233tupyzzZ kernel: Killed process 17308, UID 0, (java) total-vm:2925160kB, anon-rss:1764648kB, file-rss:248kB
Aug 3 07:24:03 iZ233tupyzzZ kernel: Thread (pooled) invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Aug 3 07:24:03 iZ233tupyzzZ kernel: Thread (pooled) cpuset=/ mems_allowed=0
Aug 3 07:24:03 iZ233tupyzzZ kernel: Pid: 6721, comm: Thread (pooled) Not tainted 2.6.32-431.23.3.el6.x86_64 #1
I also found that the default value of maxIdleTime is 20 seconds in C3p0Plugin, which is a c3p0 plugin in JFinal, so I think this is why the NonRegisteringDriver object occupies 280+ MB in the MAT report. I set maxIdleTime to 3600 seconds, and the NonRegisteringDriver object is no longer suspicious in the MAT report.
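For reference, maxIdleTime is a standard c3p0 setting, so the equivalent change when configuring c3p0 directly (outside JFinal's C3p0Plugin wrapper) would look roughly like the sketch below; the connection details are placeholders.

import com.mchange.v2.c3p0.ComboPooledDataSource;

public class PoolConfig {
    static ComboPooledDataSource newDataSource(String jdbcUrl, String user, String password) {
        ComboPooledDataSource ds = new ComboPooledDataSource();
        ds.setJdbcUrl(jdbcUrl);
        ds.setUser(user);
        ds.setPassword(password);
        ds.setMaxIdleTime(3600); // seconds an idle connection may stay pooled before c3p0 closes it
        return ds;
    }
}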
I also reset the JVM arguments to -Xms512m -Xmx512m, and the Java program has been running well for several days. Full GC is triggered as expected when the Old Gen is full.