I have been working on a java application which crawls page from Internet with http-client(version4.3.3). It uses one fixedThreadPool with 5 threads,each is a loop thread .The pseudocode is following.
public class Spiderling extends Runnable{
#Override
public void run() {
while (true) {
T task = null;
try {
task = scheduler.poll();
if (task != null) {
if Ehcache contains task's config
taskConfig = Ehcache.getConfig;
else{
taskConfig = Query task config from db;//close the conn every time
put taskConfig into Ehcache
}
spider(task,taskConfig);
}
} catch (Exception e) {
e.printStackTrace();
}
}
LOG.error("spiderling is DEAD");
}
}
I am running it with following arguments -Duser.timezone=GMT+8 -server -Xms1536m -Xmx1536m -Xloggc:/home/datalord/logs/gc-2016-07-23-10-28-24.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintHeapAtGC on a server(2 cpus,2G memory) and it crashes pretty regular about once in two or three days with no OutOfMemoryError and no JVM error log.
Here is my analysis;
I analyse the gc log with GC-EASY,the report is here. The weird thing is the Old Gen increasing slowly until the allocated max heap size,but the Full Gc has never happened even once.
I suspect it might has memory leak,so I dump the heap map using cmd jmap -dump:format=b,file=soldier.bin and using the Eclipse MAT to analyze the dump file.Here is the problem suspect which object occupies 280+ M bytes.
The class "com.mysql.jdbc.NonRegisteringDriver",
loaded by "sun.misc.Launcher$AppClassLoader # 0xa0018490", occupies 281,118,144
(68.91%) bytes. The memory is accumulated in one instance of
"java.util.concurrent.ConcurrentHashMap$Segment[]" loaded by "".
Keywords
com.mysql.jdbc.NonRegisteringDriver
java.util.concurrent.ConcurrentHashMap$Segment[]
sun.misc.Launcher$AppClassLoader # 0xa0018490.
I use c3p0-0.9.1.2 as mysql connection pool and mysql-connector-java-5.1.34 as jdbc connector and Ehcache-2.6.10 as memory cache.I have see all posts about 'com.mysql.jdbc.NonregisteringDriver memory leak' and still get no clue.
This problem has driven me crazy for several days, any advice or help will be appreciated!
**********************Supplementary description on 07-24****************
I use a JAVA WEB + ORM Framework called JFinal(github.com/jfinal/jfinal) which is open in github。
Here are some core code for further description about the problem.
/**
* CacheKit. Useful tool box for EhCache.
*
*/
public class CacheKit {
private static CacheManager cacheManager;
private static final Logger log = Logger.getLogger(CacheKit.class);
static void init(CacheManager cacheManager) {
CacheKit.cacheManager = cacheManager;
}
public static CacheManager getCacheManager() {
return cacheManager;
}
static Cache getOrAddCache(String cacheName) {
Cache cache = cacheManager.getCache(cacheName);
if (cache == null) {
synchronized(cacheManager) {
cache = cacheManager.getCache(cacheName);
if (cache == null) {
log.warn("Could not find cache config [" + cacheName + "], using default.");
cacheManager.addCacheIfAbsent(cacheName);
cache = cacheManager.getCache(cacheName);
log.debug("Cache [" + cacheName + "] started.");
}
}
}
return cache;
}
public static void put(String cacheName, Object key, Object value) {
getOrAddCache(cacheName).put(new Element(key, value));
}
#SuppressWarnings("unchecked")
public static <T> T get(String cacheName, Object key) {
Element element = getOrAddCache(cacheName).get(key);
return element != null ? (T)element.getObjectValue() : null;
}
#SuppressWarnings("rawtypes")
public static List getKeys(String cacheName) {
return getOrAddCache(cacheName).getKeys();
}
public static void remove(String cacheName, Object key) {
getOrAddCache(cacheName).remove(key);
}
public static void removeAll(String cacheName) {
getOrAddCache(cacheName).removeAll();
}
#SuppressWarnings("unchecked")
public static <T> T get(String cacheName, Object key, IDataLoader dataLoader) {
Object data = get(cacheName, key);
if (data == null) {
data = dataLoader.load();
put(cacheName, key, data);
}
return (T)data;
}
#SuppressWarnings("unchecked")
public static <T> T get(String cacheName, Object key, Class<? extends IDataLoader> dataLoaderClass) {
Object data = get(cacheName, key);
if (data == null) {
try {
IDataLoader dataLoader = dataLoaderClass.newInstance();
data = dataLoader.load();
put(cacheName, key, data);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
return (T)data;
}
}
I use CacheKit like CacheKit.get("cfg_extract_rule_tree", extractRootId, new ExtractRuleTreeDataloader(extractRootId)). and class ExtractRuleTreeDataloader will be called if find nothing in cache by extractRootId.
public class ExtractRuleTreeDataloader implements IDataLoader {
public static final Logger LOG = LoggerFactory.getLogger(ExtractRuleTreeDataloader.class);
private int ruleTreeId;
public ExtractRuleTreeDataloader(int ruleTreeId) {
super();
this.ruleTreeId = ruleTreeId;
}
#Override
public Object load() {
List<Record> ruleTreeList = Db.find("SELECT * FROM cfg_extract_fule WHERE root_id=?", ruleTreeId);
TreeHelper<ExtractRuleNode> treeHelper = ExtractUtil.batchRecordConvertTree(ruleTreeList);//convert List<Record> to and tree
if (treeHelper.isValidTree()) {
return treeHelper.getRoot();
} else {
LOG.warn("rule tree id :{} is an error tree #end#", ruleTreeId);
return null;
}
}
As I said before, I use JFinal ORM.The Db.find method code is
public List<Record> find(String sql, Object... paras) {
Connection conn = null;
try {
conn = config.getConnection();
return find(config, conn, sql, paras);
} catch (Exception e) {
throw new ActiveRecordException(e);
} finally {
config.close(conn);
}
}
and the config close method code is
public final void close(Connection conn) {
if (threadLocal.get() == null) // in transaction if conn in threadlocal
if (conn != null)
try {conn.close();} catch (SQLException e) {throw new ActiveRecordException(e);}
}
There is no transaction in my code,so I am pretty sure the conn.close() will be called every time.
**********************more description on 07-28****************
First, I use Ehcache to store the taskConfigs in the memory. And the taskConfigs almost never change, so I want store them in the memory eternally and store them to disk if the memory can not store them all.
I use MAT to find out the GC Roots of NonRegisteringDriver, and the result is show in the following picture.
The Gc Roots of NonRegisteringDriver
But I still don't understand why the default behavior of Ehcache lead memory leak.The taskConfig is a class extends the Model class.
public class TaskConfig extends Model<TaskConfig> {
private static final long serialVersionUID = 5000070716569861947L;
public static TaskConfig DAO = new TaskConfig();
}
and the source code of Model is in this page(github.com/jfinal/jfinal/blob/jfinal-2.0/src/com/jfinal/plugin/activerecord/Model.java). And I can't find any reference (either directly or indirectly) to the connection object as #Jeremiah guessing.
Then I read the source code of NonRegisteringDriver, and don't understand why the map field connectionPhantomRefs of NonRegisteringDriver holds more than 5000 entrys of <ConnectionPhantomReference, ConnectionPhantomReference>,but find no ConnectionImpl in the queue field refQueue of NonRegisteringDriver. Because I see the cleanup code in class AbandonedConnectionCleanupThread which means it will move the ref in the NonRegisteringDriver.connectionPhantomRefs while getting abandoned connection ref from NonRegisteringDriver.refQueue.
#Override
public void run() {
threadRef = this;
while (running) {
try {
Reference<? extends ConnectionImpl> ref = NonRegisteringDriver.refQueue.remove(100);
if (ref != null) {
try {
((ConnectionPhantomReference) ref).cleanup();
} finally {
NonRegisteringDriver.connectionPhantomRefs.remove(ref);
}
}
} catch (Exception ex) {
// no where to really log this if we're static
}
}
}
Appreciate the help offered by #Jeremiah !
From the comments above I'm almost certain your memory leak is actually memory usage from EhCache. The ConcurrentHashMap you're seeing is the one backing the MemoryStore, and I'm guessing that the taskConfig holds a reference (either directly or indirectly) to the connection object, which is why it's showing in your stack.
Having eternal="true" in the default cache makes it so the inserted objects are never allowed to expire. Even without that, the timeToLive and timeToIdle values default to an infinite lifetime!
Combine that with the default behavior of Ehcache when retrieving elements is to copy them (last I checked), through serialization! You're just stacking new Object references up each time the taskConfig is extracted and put back into the ehcache.
The best way to test this (in my opinion) is to change your default cache configuration. Change eternal to false, and implement a timeToIdle value. timeToIdle is a time (in seconds) that a value may exist in the cache without being accessed.
<ehcache> <diskStore path="java.io.tmpdir"/> <defaultCache maxElementsInMemory="10000" eternal="false" timeToIdle="120" overflowToDisk="true" diskPersistent="false" diskExpiryThreadIntervalSeconds="120"/>
If that works, then you may want to look into further tweaking your ehcache configuration settings, or providing a more customized cache reference other than default for your class.
There are multiple performance considerations when tweaking the ehcache. I'm sure that there is a better configuration for your business model. The Ehcache documentation is good, but I found the site to be a bit scattered when I was trying to figure it out. I've listed some links that I found useful below.
http://www.ehcache.org/documentation/2.8/configuration/cache-size.html
http://www.ehcache.org/documentation/2.8/configuration/configuration.html
http://www.ehcache.org/documentation/2.8/apis/cache-eviction-algorithms.html#provided-memorystore-eviction-algorithms
Good luck!
To test your memory leak try the following:
Insert a TaskConfig into ehcache
Immediately retrieve it back out of the cache.
output the value of TaskConfig1.equals(TaskConfig2);
If it returns false, that is your memory leak. Override equals and
hash in your TaskConfig Object and rerun the test.
The root cause of the java program is that the Linux OS runs out of memory and the OOM Killer kills the progresses.
I found the log in /var/log/messages like following.
Aug 3 07:24:03 iZ233tupyzzZ kernel: Out of memory: Kill process 17308 (java) score 890 or sacrifice child
Aug 3 07:24:03 iZ233tupyzzZ kernel: Killed process 17308, UID 0, (java) total-vm:2925160kB, anon-rss:1764648kB, file-rss:248kB
Aug 3 07:24:03 iZ233tupyzzZ kernel: Thread (pooled) invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Aug 3 07:24:03 iZ233tupyzzZ kernel: Thread (pooled) cpuset=/ mems_allowed=0
Aug 3 07:24:03 iZ233tupyzzZ kernel: Pid: 6721, comm: Thread (pooled) Not tainted 2.6.32-431.23.3.el6.x86_64 #1
I also find the default value of maxIdleTime is 20 seconds in the C3p0Plugin which is a c3p0 plugin in JFinal, So I think this is why the Object NonRegisteringDriver occupies 280+ M bytes that shown in the MAT report. So I set the maxIdleTime to 3600 seconds and the object NonRegisteringDriver is no longer suspicious in the MAT report.
And I reset the jvm argements to -Xms512m -Xmx512m. And the java program already has been running pretty well for several days. The Full Gc will be called as expected when the Old Gen is full.
Related
I am trying to cache Kafka Records within 3 minutes of interval post that it will get expired and removed from the cache.
Each incoming records which is fetched using kafka consumer written in springboot needs to be updated in cache first then if it is present i need to discard the next duplicate records if it matches the cache record.
I have tried using Caffeine cache as below,
#EnableCaching
public class AppCacheManagerConfig {
#Bean
public CacheManager cacheManager(Ticker ticker) {
CaffeineCache bookCache = buildCache("declineRecords", ticker, 3);
SimpleCacheManager cacheManager = new SimpleCacheManager();
cacheManager.setCaches(Collections.singletonList(bookCache));
return cacheManager;
}
private CaffeineCache buildCache(String name, Ticker ticker, int minutesToExpire) {
return new CaffeineCache(name, Caffeine.newBuilder().expireAfterWrite(minutesToExpire, TimeUnit.MINUTES)
.maximumSize(100).ticker(ticker).build());
}
#Bean
public Ticker ticker() {
return Ticker.systemTicker();
}
}
and my Kafka Consumer is as below,
#Autowired
CachingServiceImpl cachingService;
#KafkaListener(topics = "#{'${spring.kafka.consumer.topic}'}", concurrency = "#{'${spring.kafka.consumer.concurrentConsumers}'}", errorHandler = "#{'${spring.kafka.consumer.errorHandler}'}")
public void consume(Message<?> message, Acknowledgment acknowledgment,
#Header(KafkaHeaders.RECEIVED_TIMESTAMP) long createTime) {
logger.info("Recieved Message: " + message.getPayload());
try {
boolean approveTopic = false;
boolean duplicateRecord = false;
if (cachingService.isDuplicateCheck(declineRecord)) {
//do something with records
}
else
{
//do something with records
}
cachingService.putInCache(xmlJSONObj, declineRecord, time);
and my caching service is as below,
#Component
public class CachingServiceImpl {
private static final Logger logger = LoggerFactory.getLogger(CachingServiceImpl.class);
#Autowired
CacheManager cacheManager;
#Cacheable(value = "declineRecords", key = "#declineRecord", sync = true)
public String putInCache(JSONObject xmlJSONObj, String declineRecord, String time) {
logger.info("Record is Cached for 3 minutes interval check", declineRecord);
cacheManager.getCache("declineRecords").put(declineRecord, time);
return declineRecord;
}
public boolean isDuplicateCheck(String declineRecord) {
if (null != cacheManager.getCache("declineRecords").get(declineRecord)) {
return true;
}
return false;
}
}
But Each time a record comes in consumer my cache is always empty. Its not holding the records.
Modifications Done:
I have added Configuration file as below after going through the suggestions and more kind of R&D removed some of the earlier logic and now the caching is working as expected but duplicate check is failing when all the three consumers are sending the same records.
`
#Configuration
public class AppCacheManagerConfig {
public static Cache<String, Object> jsonCache =
Caffeine.newBuilder().expireAfterWrite(3, TimeUnit.MINUTES)
.maximumSize(10000).recordStats().build();
#Bean
public CacheLoader<Object, Object> cacheLoader() {
CacheLoader<Object, Object> cacheLoader = new CacheLoader<Object, Object>() {
#Override
public Object load(Object key) throws Exception {
return null;
}
#Override
public Object reload(Object key, Object oldValue) throws Exception {
return oldValue;
}
};
return cacheLoader;
}
`
Now i am using the above cache as manual put and get.
I guess you're trying to implement records deduplication for Kafka.
Here is the similar discussion:
https://github.com/spring-projects/spring-kafka/issues/80
Here is the current abstract class which you may extend to achieve the necessary result:
https://github.com/spring-projects/spring-kafka/blob/master/spring-kafka/src/main/java/org/springframework/kafka/listener/adapter/AbstractFilteringMessageListener.java
Your caching service is definitely incorrect: Cacheable annotation allows marking the data getters and setters, to add caching through AOP. While in the code you clearly implement some low-level cache updating logic of your own.
At least next possible changes may help you:
Remove #Cacheable. You don't need it because you work with cache manually, so it may be the source of conflicts (especially as soon as you use sync = true). If it helps, remove #EnableCaching as well - it enables support for cache-related Spring annotations which you don't need here.
Try removing Ticker bean with the appropriate parameters for other beans. It should not be harmful as per your configuration, but usually it's helpful only for tests, no need to define it otherwise.
Double-check what is declineRecord. If it's a serialized object, ensure that serialization works properly.
Add recordStats() for cache and output stats() to log for further analysis.
I'm implementing an IBackingMap for my Trident topology to store tuples to ElasticSearch (I know there are several implementations for Trident/ElasticSearch integration already existing at GitHub however I've decided to implement a custom one which suits my task better).
So my implementation is a classic one with a factory:
public class ElasticSearchBackingMap implements IBackingMap<OpaqueValue<BatchAggregationResult>> {
// omitting here some other cool stuff...
private final Client client;
public static StateFactory getFactoryFor(final String host, final int port, final String clusterName) {
return new StateFactory() {
#Override
public State makeState(Map conf, IMetricsContext metrics, int partitionIndex, int numPartitions) {
ElasticSearchBackingMap esbm = new ElasticSearchBackingMap(host, port, clusterName);
CachedMap cm = new CachedMap(esbm, LOCAL_CACHE_SIZE);
MapState ms = OpaqueMap.build(cm);
return new SnapshottableMap(ms, new Values(GLOBAL_KEY));
}
};
}
public ElasticSearchBackingMap(String host, int port, String clusterName) {
Settings settings = ImmutableSettings.settingsBuilder()
.put("cluster.name", clusterName).build();
// TODO add a possibility to close the client
client = new TransportClient(settings)
.addTransportAddress(new InetSocketTransportAddress(host, port));
}
// the actual implementation is left out
}
You see it gets host/port/cluster name as input params and creates an ElasticSearch client as a member of the class BUT IT NEVER CLOSES THE CLIENT.
It is then used from within a topology in a pretty familiar way:
tridentTopology.newStream("spout", spout)
// ...some processing steps here...
.groupBy(aggregationFields)
.persistentAggregate(
ElasticSearchBackingMap.getFactoryFor(
ElasticSearchConfig.ES_HOST,
ElasticSearchConfig.ES_PORT,
ElasticSearchConfig.ES_CLUSTER_NAME
),
new Fields(FieldNames.OUTCOME),
new BatchAggregator(),
new Fields(FieldNames.AGGREGATED));
This topology is wrapped into some public static void main, packed in a jar and sent to Storm for execution.
The question is, should I worry about closing the ElasticSearch connection or it is Storm's own business? If it is not done by Storm, how and when in the topology's lifecycle I should do that?
Thanks in advance!
Okay, answering my own question.
First of all, thanks again #dedek for suggestions and reviving the ticket in Storm's Jira.
Finally, since there's no official way to do that, I've decided to go for cleanup() method of Trident's Filter. So far I've verified the following (for Storm v. 0.9.4):
With LocalCluster
cleanup() gets called on cluster's shutdown
cleanup() DOESN'T get called when killing the topology, this shouldn't be a tragedy, very likely one won't use LocalCluster for real deployments anyway
With a real cluster
it gets called when the topology is killed as well as when the worker is stopped using pkill -TERM -u storm -f 'backtype.storm.daemon.worker'
it doesn't get called if the worker is killed with kill -9 or when it crashes or - sadly - when the worker dies due to an exception
In overall that gives more or less decent guarantee of cleanup() to get called, provided you'll be careful with exception handling (I tend to add 'thundercatches' to every of my Trident primitives anyway).
My code:
public class CloseFilter implements Filter {
private static final Logger LOG = LoggerFactory.getLogger(CloseFilter.class);
private final Closeable[] closeables;
public CloseFilter(Closeable... closeables) {
this.closeables = closeables;
}
#Override
public boolean isKeep(TridentTuple tuple) {
return true;
}
#Override
public void prepare(Map conf, TridentOperationContext context) {
}
#Override
public void cleanup() {
for (Closeable c : closeables) {
try {
c.close();
} catch (Exception e) {
LOG.warn("Failed to close an instance of {}", c.getClass(), e);
}
}
}
}
However would be nice if some day hooks for closing connections become a part of the API.
Code:
public static class Oya {
String name;
public Oya(String name) {
super();
this.name = name;
}
/* (non-Javadoc)
* #see java.lang.Object#toString()
*/
#Override
public String toString() {
return "Oya [name=" + name + "]";
}
}
public static void main(String[] args) throws GridException {
try (Grid grid = GridGain.start(
System.getProperty("user.home") + "/gridgain-platform-os-6.1.9-nix/examples/config/example-cache.xml")) {
GridCache<Integer, Oya> cache = grid.cache("partitioned");
boolean success2 = cache.putxIfAbsent(3, new Oya("3"));
log.info("Current 3 value = {}", cache.get(3));
cache.transform(3, (it) -> new Oya(it.name + "-transformed"));
log.info("Transformed 3 value = {}", cache.get(3));
}
}
Start another GridGain node.
Run the code. It should print: 3-transformed
Comment the putxIfAbsent() code.
Run the code. I expected it to print: 3-transformed but got null instead
The code will work if I change the cache value to a String (like in GridGain Basic Operations video) or a Java builtin value, but not for my own custom class.
Peer-deployment for data grid is a development-only feature. The contract of a SHARED mode is that whenever last node which had original class definition leaves, all classes will be undeployed. For Data Grid it means that the cache will be cleared. This is useful for cases when you change class definitions.
In CONTINUOUS mode, cache classes never get undeployed, but in this case you must be careful not to change definitions of classes without restarting the grid nodes.
For more information see Deployment Modes documentation.
I have an app that uses the "task:scheduler" and "task:scheduled-tasks" elements (the latter containing "task:scheduled" elements). This is all working fine.
I'm trying to write some code that introspects the "application configuration" to get a short summary of some important information, like what tasks are scheduled and what their schedule is.
I already have a class that has a bunch of "#Autowired" instance variables so I can iterate through all of this. It was easy enough to add a "List" to get all of the TaskScheduler objects. I only have two of these, and I have a different set of scheduled tasks in each of them.
What I can't see in those TaskScheduler objects (they are actually ThreadPoolTaskScheduler objects) is anything that looks like a list of scheduled tasks, so I'm guessing the list of scheduled tasks is recorded somewhere else.
What objects can I use to introspect the set of scheduled tasks, and which thread pool they are in?
This functionality will be available in Spring 4.2
https://jira.spring.io/browse/SPR-12748 (Disclaimer: I reported this issue and contributed code towards its solution).
// Warning there may be more than one ScheduledTaskRegistrar in your
// application context. If this is the case you can autowire a list of
// ScheduledTaskRegistrar instead.
#Autowired
private ScheduledTaskRegistrar scheduledTaskRegistrar;
public List<Task> getScheduledTasks() {
List<Task> result = new ArrayList<Task>();
result.addAll(this.scheduledTaskRegistrar.getTriggerTaskList());
result.addAll(this.scheduledTaskRegistrar.getCronTaskList());
result.addAll(this.scheduledTaskRegistrar.getFixedRateTaskList());
result.addAll(this.scheduledTaskRegistrar.getFixedDelayTaskList());
return result;
}
// You can this inspect the tasks,
// so for example a cron task can be inspected like this:
public List<CronTask> getScheduledCronTasks() {
List<CronTask> cronTaskList = this.scheduledTaskRegistrar.getCronTaskList();
for (CronTask cronTask : cronTaskList) {
System.out.println(cronTask.getExpression);
}
return cronTaskList;
}
If you are using a ScheduledMethodRunnable defined in XML:
<task:scheduled method="run" cron="0 0 12 * * ?" ref="MyObject" />
You can access the underlying target object:
ScheduledMethodRunnable scheduledMethodRunnable = (ScheduledMethodRunnable) task.getRunnable();
TargetClass target = (TargetClass) scheduledMethodRunnable.getTarget();
I have a snippet for pre spring 4.2 since it is still sitting at release candidate level.
The scheduledFuture interface is implemented by every runnable element in the BlockingQueue.
Map<String, ThreadPoolTaskScheduler> schedulers = applicationContext
.getBeansOfType(ThreadPoolTaskScheduler.class);
for (ThreadPoolTaskScheduler scheduler : schedulers.values()) {
ScheduledExecutorService exec = scheduler.getScheduledExecutor();
ScheduledThreadPoolExecutor poolExec = scheduler
.getScheduledThreadPoolExecutor();
BlockingQueue<Runnable> queue = poolExec.getQueue();
Iterator<Runnable> iter = queue.iterator();
while (iter.hasNext()) {
ScheduledFuture<?> future = (ScheduledFuture<?>) iter.next();
future.getDelay(TimeUnit.MINUTES);
Runnable job = iter.next();
logger.debug(MessageFormat.format(":: Task Class is {0}", JobDiscoverer.findRealTask(job)));
}
Heres a reflective way to get information about which job class is in the pool as threadPoolNamePrefix didn't return a distinct name for me:
public class JobDiscoverer {
private final static Field syncInFutureTask;
private final static Field callableInFutureTask;
private static final Class<? extends Callable> adapterClass;
private static final Field runnableInAdapter;
private static Field reschedulingRunnable;
private static Field targetScheduledMethod;
static {
try {
reschedulingRunnable = Class
.forName(
"org.springframework.scheduling.support.DelegatingErrorHandlingRunnable")
.getDeclaredField("delegate");
reschedulingRunnable.setAccessible(true);
targetScheduledMethod = Class
.forName(
"org.springframework.scheduling.support.ScheduledMethodRunnable")
.getDeclaredField("target");
targetScheduledMethod.setAccessible(true);
callableInFutureTask = Class.forName(
"java.util.concurrent.FutureTask$Sync").getDeclaredField(
"callable");
callableInFutureTask.setAccessible(true);
syncInFutureTask = FutureTask.class.getDeclaredField("sync");
syncInFutureTask.setAccessible(true);
adapterClass = Executors.callable(new Runnable() {
public void run() {
}
}).getClass();
runnableInAdapter = adapterClass.getDeclaredField("task");
runnableInAdapter.setAccessible(true);
} catch (NoSuchFieldException e) {
throw new ExceptionInInitializerError(e);
} catch (SecurityException e) {
throw new PiaRuntimeException(e);
} catch (ClassNotFoundException e) {
throw new PiaRuntimeException(e);
}
}
public static Object findRealTask(Runnable task) {
if (task instanceof FutureTask) {
try {
Object syncAble = syncInFutureTask.get(task);
Object callable = callableInFutureTask.get(syncAble);
if (adapterClass.isInstance(callable)) {
Object reschedulable = runnableInAdapter.get(callable);
Object targetable = reschedulingRunnable.get(reschedulable);
return targetScheduledMethod.get(targetable);
} else {
return callable;
}
} catch (IllegalAccessException e) {
throw new IllegalStateException(e);
}
}
throw new ClassCastException("Not a FutureTask");
}
With #Scheduled based configuration the approach from Tobias M’s answer does not work out-of-the-box.
Instead of autowiring a ScheduledTaskRegistrar instance (which is not available for annotation based configuration), you can instead autowire a ScheduledTaskHolder which only has a getScheduledTasks() method.
Background:
The ScheduledAnnotationBeanPostProcessor used to manage #Scheduled tasks has an internal ScheduledTaskRegistrar that’s not available as a bean. It does implement ScheduledTaskHolder, though.
Every Spring XML element has a corresponding BeanDefinitionReader. For <task:scheduled-tasks>, that's ScheduledTasksBeanDefinitionParser.
This uses a BeanDefinitionBuilder to create a BeanDefinition for a bean of type ContextLifecycleScheduledTaskRegistrar. Your scheduled tasks will be stored in that bean.
The tasks will be executing in either a default TaskScheduler or one you provided.
I've given you the class names so you can look at the source code yourself if you need more fine grained details.
I've read several questions similar to this, but none of the answers provide ideas of how to clean up memory while still maintaining lock integrity. I'm estimating the number of key-value pairs at a given time to be in the tens of thousands, but the number of key-value pairs over the lifespan of the data structure is virtually infinite (realistically it probably wouldn't be more than a billion, but I'm coding to the worst case).
I have an interface:
public interface KeyLock<K extends Comparable<? super K>> {
public void lock(K key);
public void unock(K key);
}
with a default implementation:
public class DefaultKeyLock<K extends Comparable<? super K>> implements KeyLock<K> {
private final ConcurrentMap<K, Mutex> lockMap;
public DefaultKeyLock() {
lockMap = new ConcurrentSkipListMap<K, Mutex>();
}
#Override
public void lock(K key) {
Mutex mutex = new Mutex();
Mutex existingMutex = lockMap.putIfAbsent(key, mutex);
if (existingMutex != null) {
mutex = existingMutex;
}
mutex.lock();
}
#Override
public void unock(K key) {
Mutex mutex = lockMap.get(key);
mutex.unlock();
}
}
This works nicely, but the map never gets cleaned up. What I have so far for a clean implementation is:
public class CleanKeyLock<K extends Comparable<? super K>> implements KeyLock<K> {
private final ConcurrentMap<K, LockWrapper> lockMap;
public CleanKeyLock() {
lockMap = new ConcurrentSkipListMap<K, LockWrapper>();
}
#Override
public void lock(K key) {
LockWrapper wrapper = new LockWrapper(key);
wrapper.addReference();
LockWrapper existingWrapper = lockMap.putIfAbsent(key, wrapper);
if (existingWrapper != null) {
wrapper = existingWrapper;
wrapper.addReference();
}
wrapper.addReference();
wrapper.lock();
}
#Override
public void unock(K key) {
LockWrapper wrapper = lockMap.get(key);
if (wrapper != null) {
wrapper.unlock();
wrapper.removeReference();
}
}
private class LockWrapper {
private final K key;
private final ReentrantLock lock;
private int referenceCount;
public LockWrapper(K key) {
this.key = key;
lock = new ReentrantLock();
referenceCount = 0;
}
public synchronized void addReference() {
lockMap.put(key, this);
referenceCount++;
}
public synchronized void removeReference() {
referenceCount--;
if (referenceCount == 0) {
lockMap.remove(key);
}
}
public void lock() {
lock.lock();
}
public void unlock() {
lock.unlock();
}
}
}
This works for two threads accessing a single key lock, but once a third thread is introduced the lock integrity is no longer guaranteed. Any ideas?
I don't buy that this works for two threads. Consider this:
(Thread A) calls lock(x), now holds lock x
thread switch
(Thread B) calls lock(x), putIfAbsent() returns the current wrapper for x
thread switch
(Thread A) calls unlock(x), the wrapper reference count hits 0 and it gets removed from the map
(Thread A) calls lock(x), putIfAbsent() inserts a new wrapper for x
(Thread A) locks on the new wrapper
thread switch
(Thread B) locks on the old wrapper
How about:
LockWrapper starts with a reference count of 1
addReference() returns false if the reference count is 0
in lock(), if existingWrapper != null, we call addReference() on it. If this returns false, it has already been removed from the map, so we loop back and try again from the putIfAbsent()
I would use a fixed array by default for a striped lock, since you can size it to the concurrency level that you expect. While there may be hash collisions, a good spreader will resolve that. If the locks are used for short critical sections, then you may be creating contention in the ConcurrentHashMap that defeats the optimization.
You're welcome to adapt my implementation, though I only implemented the dynamic version for fun. It didn't seem useful in practice so only the fixed was used in production. You can use the hash() function from ConcurrentHashMap to provide a good spreading.
ReentrantStripedLock in,
http://code.google.com/p/concurrentlinkedhashmap/wiki/IndexableCache