Hazelcast: configure file storage for caching

I need a Java cache with file storage that survives JVM crashes.
Previously I used Ehcache, configured with .heap().disk().
However, it has a problem with unclean JVM shutdowns - the next startup clears the store.
My only requirement is that at least part of the data survives a restart.
I tried Hazelcast, but with the following code snippet, even a subsequent run of the program prints "null".
Please suggest how to configure Hazelcast so that cache.put is written to disk and loaded on startup.
public class HazelcastTest {
    public static void main(String[] args) throws InterruptedException {
        System.setProperty("hazelcast.jcache.provider.type", "server");
        Config config = new Config();
        HotRestartPersistenceConfig hotRestartPersistenceConfig = new HotRestartPersistenceConfig()
                .setEnabled(true)
                .setBaseDir(new File("cache"))
                .setBackupDir(new File("cache/backup"))
                .setParallelism(1)
                .setClusterDataRecoveryPolicy(HotRestartClusterDataRecoveryPolicy.FULL_RECOVERY_ONLY);
        config.setHotRestartPersistenceConfig(hotRestartPersistenceConfig);
        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
        CacheConfig<String, String> cacheConfig = new CacheConfig<>();
        cacheConfig.getHotRestartConfig().setEnabled(true);
        cacheConfig.getHotRestartConfig().setFsync(true);
        CachingProvider cachingProvider = Caching.getCachingProvider();
        Cache<String, String> data = cachingProvider.getCacheManager().createCache("data", cacheConfig);
        System.out.println(data.get("test"));
        data.put("test", "value");
        data.close();
        instance.shutdown();
    }
}
Suggestions for other frameworks that could complete the task are also welcome.

@Igor, Hot Restart is an Enterprise feature of Hazelcast. You need to use the Hazelcast Enterprise edition with a valid license key.
Do you really need to store to a file, or just persist the cache data somewhere else? If you can use a database, you can use a MapStore, which is available in the open-source version, and write the data to a persistent data store. You can even use write-behind mode to speed up writes.
See this sample project: https://github.com/hazelcast/hazelcast-code-samples/tree/master/distributed-map/mapstore
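For illustration, a rough sketch of that MapStore approach using a naive file-per-entry store against Hazelcast 3.x (the class name and base directory are my own assumptions, not from the answer; a real setup would more likely target a database):

```java
import com.hazelcast.core.MapStore;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Each key becomes one file under baseDir, so entries written through the
// IMap survive a JVM restart (or crash, once the file has been flushed).
public class FileMapStore implements MapStore<String, String> {

    private final Path baseDir = Paths.get("mapstore-data");

    @Override
    public void store(String key, String value) {
        try {
            Files.createDirectories(baseDir);
            Files.write(baseDir.resolve(key), value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void storeAll(Map<String, String> map) {
        map.forEach(this::store);
    }

    @Override
    public void delete(String key) {
        try {
            Files.deleteIfExists(baseDir.resolve(key));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void deleteAll(Collection<String> keys) {
        keys.forEach(this::delete);
    }

    @Override
    public String load(String key) {
        try {
            Path file = baseDir.resolve(key);
            return Files.exists(file)
                    ? new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
                    : null;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public Map<String, String> loadAll(Collection<String> keys) {
        Map<String, String> result = new HashMap<>();
        keys.forEach(k -> result.put(k, load(k)));
        return result;
    }

    @Override
    public Iterable<String> loadAllKeys() {
        try {
            if (!Files.isDirectory(baseDir)) {
                return null; // null means there is nothing to pre-load
            }
            return Files.list(baseDir)
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

It would be wired onto a map via a MapStoreConfig (setImplementation(new FileMapStore())); setting write-delay-seconds greater than zero on that config turns on the write-behind mode mentioned above.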


Redis cache metrics with Prometheus (Spring Boot)

I am using RedisTemplate for caching in my Spring Boot service. Now I want to check cache hits/misses through the actuator/prometheus endpoint, but I cannot see any hit/miss data for the cache.
The code I have written is something like the below:
@EnableCaching
@Configuration
public class CachingConfiguration {
    @Bean
    public RedisTemplate<String, SomeData> redisTemplate(LettuceConnectionFactory connectionFactory, ObjectMapper objectMapper) {
        RedisTemplate<String, SomeData> template = new RedisTemplate<>();
        template.setConnectionFactory(connectionFactory);
        var valueSerializer = new Jackson2JsonRedisSerializer<SomeData>(SomeData.class);
        valueSerializer.setObjectMapper(objectMapper);
        template.setValueSerializer(valueSerializer);
        return template;
    }
}
Now I am doing the following to get and save into the cache.
To get:
redisTemplate.opsForValue().get(key);
And to save:
redisTemplate.opsForValue().set(key, obj, some_time_limit);
My cache is working properly - I am able to save into the cache and read proper data back.
But I don't see any cache hit/miss data in actuator/prometheus.
In my application.yml file I have added the below:
cache:
  redis:
    enable-statistics: 'true'
I would assume that for Spring Boot cache monitoring (including hits/misses) to apply, you need to rely on auto-configuration.
In your case you are creating the RedisTemplate yourself, so enable-statistics is probably not actually applied.
Can you remove the RedisTemplate creation and use the @Cacheable annotation abstraction? That way any supported cache library will work out of the box, without you having to create a @Bean and configure it manually.
Otherwise, if you want to enable statistics on a cache manager manually, you will need to call RedisCacheManager.RedisCacheManagerBuilder's enableStatistics():
https://docs.spring.io/spring-data/redis/docs/current/api/org/springframework/data/redis/cache/RedisCacheManager.RedisCacheManagerBuilder.html
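A minimal sketch of that manual route, assuming Spring Data Redis 2.4 or later (where enableStatistics() exists on the builder); the configuration class and bean names are my own:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.connection.RedisConnectionFactory;

@Configuration
public class CacheManagerConfiguration {

    // Builds a RedisCacheManager with statistics collection turned on, so that
    // hit/miss counts for its caches are exposed to the metrics registry.
    @Bean
    public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        return RedisCacheManager.builder(connectionFactory)
                .enableStatistics()
                .build();
    }
}
```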
For Reference:
Auto-configuration enables the instrumentation of all available Cache
instances on startup, with metrics prefixed with cache. Cache
instrumentation is standardized for a basic set of metrics.
Additional, cache-specific metrics are also available.
Metrics are tagged by the name of the cache and by the name of the
CacheManager, which is derived from the bean name.
Only caches that are configured on startup are bound to the registry. For caches not
defined in the cache’s configuration, such as caches created on the
fly or programmatically after the startup phase, an explicit
registration is required. A CacheMetricsRegistrar bean is made
available to make that process easier.
I had exactly the same question and spent a good number of hours trying to figure out how to enable cache metrics for my manually created RedisTemplate instance.
What I eventually realised is that it's only the RedisCache class which collects and exposes CacheStatistics through its getStatistics() method. As far as I can see, there is nothing like that for RedisTemplate, which means you either need to switch to using RedisCache through RedisCacheManager and the @Cacheable annotation, or implement your own custom metrics collection.
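For illustration, a sketch of that annotation-driven alternative (the service class, method, and cache name are assumptions, not from the original post; SomeData is the value type from the question):

```java
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class SomeDataService {

    // Results land in the "someData" cache managed by the configured
    // RedisCacheManager, so hits and misses are tracked automatically.
    @Cacheable(cacheNames = "someData", key = "#key")
    public SomeData find(String key) {
        return loadFromDatabase(key); // only invoked on a cache miss
    }

    private SomeData loadFromDatabase(String key) {
        // placeholder for the real lookup
        return new SomeData();
    }
}
```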

How to run Hadoop as part of test suite of Spring application?

I would like to set up a simple "Hello, World!" to get an understanding of how to use basic Hadoop functionality, such as storing/reading files using HDFS.
Is it possible to:
Run an embedded Hadoop as part of my application?
Run an embedded Hadoop as part of my tests?
I would like to put together a minimal Spring Boot setup for this. What is the minimal Spring configuration required? There are sufficient examples illustrating how to read/write files using HDFS, but I still haven't been able to work out what I need as Spring configuration. It's a bit hard to figure out which libraries one really needs, as the Spring Hadoop examples seem to be out of date. Any help would be much appreciated.
You can easily use the Hadoop FileSystem API with any local POSIX filesystem, without a Hadoop cluster.
The Hadoop API is very generic and provides many concrete implementations for different storage systems such as HDFS, S3, Azure Data Lake Store, etc.
You can embed HDFS within your application (i.e. run the Namenode and Datanodes within a single JVM process), but this is only reasonable for tests.
There is the Hadoop Minicluster, which you can start from the command line (CLI MiniCluster) or via the Java API in your unit tests with the MiniDFSCluster class found in the hadoop-minicluster package.
You can start the mini-cluster with Spring by making a separate configuration for it and using it via @ContextConfiguration in your unit tests.
@org.springframework.context.annotation.Configuration
public class MiniClusterConfiguration {

    @Bean(name = "temp-folder", initMethod = "create", destroyMethod = "delete")
    public TemporaryFolder temporaryFolder() {
        return new TemporaryFolder();
    }

    @Bean
    public Configuration configuration(final TemporaryFolder temporaryFolder) {
        final Configuration conf = new Configuration();
        conf.set(
                MiniDFSCluster.HDFS_MINIDFS_BASEDIR,
                temporaryFolder.getRoot().getAbsolutePath()
        );
        return conf;
    }

    @Bean(destroyMethod = "shutdown")
    public MiniDFSCluster cluster(final Configuration conf) throws IOException {
        final MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
                .clusterId(String.valueOf(this.hashCode()))
                .build();
        cluster.waitClusterUp();
        return cluster;
    }

    @Bean
    public FileSystem fileSystem(final MiniDFSCluster cluster) throws IOException {
        return cluster.getFileSystem();
    }

    @Bean
    @Primary
    @Scope(BeanDefinition.SCOPE_PROTOTYPE)
    public Path temp(final FileSystem fs) throws IOException {
        final Path path = new Path("/tmp", UUID.randomUUID().toString());
        fs.mkdirs(path);
        return path;
    }
}
You can inject FileSystem and a temporary Path into your tests, and, as I've mentioned above, from an API standpoint there is no difference whether it's a real cluster, a mini-cluster, or a local filesystem. Note that starting the cluster has a cost, so you probably do not want to annotate your tests with @DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD), since that would restart the cluster for each test method.
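A test wired to this configuration might look like the following sketch (the test class, method, and file names are my own assumptions; JUnit 4 to match the TemporaryFolder rule used above):

```java
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringRunner;

@RunWith(SpringRunner.class)
@ContextConfiguration(classes = MiniClusterConfiguration.class)
public class HdfsRoundTripTest {

    @Autowired
    private FileSystem fs;

    @Autowired
    private Path temp; // fresh per-test directory from the prototype-scoped bean

    @Test
    public void writesAndReadsAFile() throws Exception {
        Path file = new Path(temp, "hello.txt");
        // Write a string into HDFS (or whatever FileSystem is wired in) ...
        try (FSDataOutputStream out = fs.create(file)) {
            out.writeUTF("Hello, World!");
        }
        // ... and read it back.
        try (FSDataInputStream in = fs.open(file)) {
            Assert.assertEquals("Hello, World!", in.readUTF());
        }
    }
}
```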
If you want this code to run on Windows, you will need a compatibility layer called winutils (which makes it possible to access the Windows filesystem in a POSIX-like way).
You have to point the HADOOP_HOME environment variable to it and, depending on the version, load its shared library:
String HADOOP_HOME = System.getenv("HADOOP_HOME");
System.setProperty("hadoop.home.dir", HADOOP_HOME);
System.setProperty("hadoop.tmp.dir", System.getProperty("java.io.tmpdir"));
final String lib = String.format("%s/lib/hadoop.dll", HADOOP_HOME);
System.load(lib);

Hazelcast Cache Manager: Cannot overwrite a Cache's CacheManager

On an application I am working on, I am trying to upgrade from Hazelcast 3.6 to 3.12.4, and I am encountering some problems which reproduce easily when two or more tests are run together. The tests are all annotated with @WebAppConfiguration and include Spring's application configuration using @ContextConfiguration(classes = {AppConfig.class}).
As part of the configuration, I have a @Bean called CacheAwareStorage that initializes the CacheManager. The initialization is quite basic:
public Cache<T, V> initCache(String cacheName, Class<T> keyType, Class<V> valueType) {
    Cache<T, V> cache = manager.getCache(cacheName, keyType, valueType);
    if (cache != null) {
        return cache;
    }
    cache = manager.createCache(cacheName, config);
    return cache;
}
The problem occurs when the context is refreshed as part of the test suite, which I think is done in AbstractTestNGSpringContextTests since I don't explicitly refresh the context. The following error occurs, which results in only the first class of tests passing:
GenericWebApplicationContext: Refreshing org.springframework.web.context.support.GenericWebApplicationContext@6170989a
....
WARN GenericWebApplicationContext: Exception encountered during context initialization - cancelling refresh attempt
....
Factory method 'tokenStore' threw exception
nested exception is java.lang.IllegalStateException: Cannot overwrite a Cache's CacheManager.
Looking over what has changed, I see that AbstractHazelcastCacheManager throws an IllegalStateException which comes from the Hazelcast CacheProxy. To be more precise, manager.getCache() -> getCacheUnchecked() creates a cache proxy in createCacheProxy() and sets the proxy's manager to the current manager in cacheProxy.setCacheManager().
Starting with Hazelcast 3.9, this is no longer allowed once the manager has already been set.
What would be a solution for this? It may be that there is a bug in Hazelcast (there is no check whether the manager being set is actually different from the already existing one); however, I am looking for something that I can do on my side. Why getCache() tries to re-create the proxy is another thing that I do not understand.
I assume that I must do something so that the context is not refreshed; however, I don't know how (if at all) I can do that.
The problem was due to the way the CacheManager bean was created. I used the internal Hazelcast cache manager, and a new instance was created each time. Using the JCache API as below solved the problem:
@Bean
public CacheManager cacheManager() {
    // Pass the class name of the Hazelcast server caching provider to
    // Caching.getCachingProvider(String) if you need to disambiguate.
    CachingProvider provider = Caching.getCachingProvider();
    return provider.getCacheManager(null, null, HazelcastCachingProvider.propertiesByInstanceItself(HAZELCAST_INSTANCE));
}
Help received from Hazelcast team on this: https://github.com/hazelcast/hazelcast/issues/16212

Develop programmatically a Jgroup Channel for Infinispan in a Cluster

I'm working with Infinispan 8.1.0.Final and Wildfly 10 in a cluster setup.
Each server is started by running:
C:\wildfly-10\bin\standalone.bat --server-config=standalone-ha.xml -b 10.09.139.215 -u 230.0.0.4 -Djboss.node.name=MyNode
I want to use Infinispan in distributed mode in order to have a distributed cache, but due to mandatory requirements I need to build a JGroups channel that dynamically reads some properties from a file.
This channel is necessary for me to build a cluster group based on TYPE and NAME (for example Type1-MyCluster). Each server that wants to join a cluster has to use the related channel.
Searching the net, I found some code like the one below:
public class JGroupsChannelServiceActivator implements ServiceActivator {

    // Field declarations reconstructed from usage; the channel name matches
    // the "clusterWatchdog" channel referenced in the deployment error below.
    private static final String CHANNEL_NAME = "clusterWatchdog";
    private static final Logger log = Logger.getLogger(JGroupsChannelServiceActivator.class.getName());
    private String stackName;
    private ServiceName channelServiceName;

    @Override
    public void activate(ServiceActivatorContext context) {
        stackName = "udp";
        try {
            channelServiceName = ChannelService.getServiceName(CHANNEL_NAME);
            createChannel(context.getServiceTarget());
        } catch (IllegalStateException e) {
            log.log(Level.INFO, "channel seems to already exist, skipping creation and binding.");
        }
    }

    void createChannel(ServiceTarget target) {
        InjectedValue<ChannelFactory> channelFactory = new InjectedValue<>();
        ServiceName serviceName = ChannelFactoryService.getServiceName(stackName);
        ChannelService channelService = new ChannelService(CHANNEL_NAME, channelFactory);
        target.addService(channelServiceName, channelService)
                .addDependency(serviceName, ChannelFactory.class, channelFactory).install();
    }
}
I have created the META-INF/services/....JGroupsChannelServiceActivator file.
When I deploy my war into the server, the operation fails with this error:
"{\"WFLYCTL0180: Services with missing/unavailable dependencies\" => [\"jboss.jgroups.channel.clusterWatchdog is missing [jboss.jgroups.stack.udp]\"]}"
What am I doing wrong?
How can I build a channel the way I need?
And how can I tell Infinispan to use that channel for distributed caching?
The proposal you found is implementation-dependent and might cause a lot of problems during an upgrade. I wouldn't recommend it.
Let me check if I understand your problem correctly: you need to be able to create a JGroups channel manually because you use some custom properties for it.
If that is the case, you could obtain a JGroups channel as suggested here. But then you obtain a JChannel instance which is already connected (so this might be too late for your case).
Unfortunately, since Wildfly manages the JChannel (it is required for clustering sessions, EJBs, etc.), the only way to get full control of the JChannel creation process is to use Infinispan in embedded (library) mode. This requires adding infinispan-embedded to your WAR dependencies. After that you can initialize it similarly to this test.
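A rough sketch of that embedded approach, assuming Infinispan 8.x with JGroups 3.x (the stack file name, class name, and cluster-naming scheme are my own assumptions, not from the answer):

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;
import org.infinispan.remoting.transport.jgroups.JGroupsTransport;
import org.jgroups.JChannel;

public class EmbeddedInfinispanStarter {

    public static EmbeddedCacheManager start(String type, String name) throws Exception {
        // Build the channel yourself, e.g. from properties read from a file
        // ("udp.xml" here stands in for your customized JGroups stack).
        JChannel channel = new JChannel("udp.xml");
        channel.setName(name);

        // Hand the pre-created channel to Infinispan via a JGroupsTransport,
        // using the TYPE-NAME scheme (e.g. "Type1-MyCluster") as cluster name.
        GlobalConfigurationBuilder global = GlobalConfigurationBuilder.defaultClusteredBuilder();
        global.transport()
                .clusterName(type + "-" + name)
                .transport(new JGroupsTransport(channel));

        EmbeddedCacheManager manager = new DefaultCacheManager(global.build());
        manager.defineConfiguration("distributed",
                new ConfigurationBuilder().clustering().cacheMode(CacheMode.DIST_SYNC).build());
        return manager;
    }
}
```

Infinispan then connects the channel itself and uses it for the distributed cache, which addresses the third question above.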

Can I use the JCache API for distributed caches in Apache Ignite?

I would like to configure a distributed cache with Apache Ignite using the JCache API (JSR107, javax.cache). Is this possible?
The examples I have found either create a local cache with the JCache API or create a distributed cache (or datagrid) using the Apache Ignite API.
JCache allows you to provide provider-specific configuration when creating a cache, i.e. you can do this:
// Get or create a cache manager.
CacheManager cacheMgr = Caching.getCachingProvider().getCacheManager();
// This is an Ignite configuration object (org.apache.ignite.configuration.CacheConfiguration).
CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>();
// Specify cache mode and/or any other Ignite-specific configuration properties.
cfg.setCacheMode(CacheMode.PARTITIONED);
// Create a cache based on the configuration created above.
Cache<Integer, String> cache = cacheMgr.createCache("a", cfg);
Also note that partitioned mode is actually the default one in Ignite, so you are not required to specify it explicitly.
UPD. In addition, CachingProvider.getCacheManager(..) method accepts a provider-specific URI that in case of Ignite should point to XML configuration file. Discovery, communication and other parameters can be provided there.
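For illustration, passing such a URI might look like this (the configuration file path is an assumption):

```java
import java.net.URI;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.spi.CachingProvider;

public class IgniteJCacheFromXml {
    public static void main(String[] args) throws Exception {
        CachingProvider provider = Caching.getCachingProvider();
        // Point the provider at an Ignite Spring XML file that holds the
        // discovery, communication, and cache settings.
        CacheManager cacheMgr = provider.getCacheManager(
                new URI("config/example-ignite.xml"), null);
        System.out.println("Caches: " + cacheMgr.getCacheNames());
    }
}
```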
Please note that the JCache specification does not cover all the configuration that applies to individual cache providers: creating a CacheManager is standardized, but how the manager itself is configured is not.
The following code demonstrates how to create a grid using Apache Ignite in Spring Boot:
@Bean
@SuppressWarnings("unchecked")
public org.apache.ignite.cache.spring.SpringCacheManager cacheManager() {
    IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
    igniteConfiguration.setGridName("petclinic-ignite-grid");
    //igniteConfiguration.setClassLoader(dynamicClassLoaderWrapper());
    igniteConfiguration.setCacheConfiguration(this.createDefaultCache("petclinic"),
            this.createDefaultCache("org.hibernate.cache.spi.UpdateTimestampsCache"),
            this.createDefaultCache("org.hibernate.cache.internal.StandardQueryCache"));
    SpringCacheManager springCacheManager = new SpringCacheManager();
    springCacheManager.setConfiguration(igniteConfiguration);
    springCacheManager.setDynamicCacheConfiguration(this.createDefaultCache(null));
    return springCacheManager;
}

private org.apache.ignite.configuration.CacheConfiguration createDefaultCache(String name) {
    org.apache.ignite.configuration.CacheConfiguration cacheConfiguration = new org.apache.ignite.configuration.CacheConfiguration();
    cacheConfiguration.setName(name);
    cacheConfiguration.setCacheMode(CacheMode.PARTITIONED);
    cacheConfiguration.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
    cacheConfiguration.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
    cacheConfiguration.setStatisticsEnabled(true);
    cacheConfiguration.setEvictSynchronized(true);
    return cacheConfiguration;
}
If we were to create another instance of this service and have it register to the same grid (igniteConfiguration.setGridName("petclinic-ignite-grid")), an IMDG would be formed. Please note that the two service instances of this partitioned, embedded distributed cache must be able to talk to each other on the required ports. Please refer to Apache Ignite - Data Grid for more details.
Hope this helps.
