Changing hazelcast configuration at runtime - caching

Is it possible to change the Hazelcast configuration at runtime, and if so, which parameters are modifiable?
It seems to be possible using Hazelcast Management Center, but I can't find any examples or references in the official docs or forums.

Might be a bit late to answer your question but better late than never :)
You can modify some of the map config properties after the map has been created using the MapService:
HazelcastInstance instance = Hazelcast.newHazelcastInstance();
// create map
IMap<String, Integer> myMap = instance.getMap("myMap");
// create a new map config
MapConfig newMapConfig = instance.getConfig().getMapConfig("myMap").setAsyncBackupCount(1);
// submit the new map config to the map service
MapService mapService = (MapService)(((AbstractDistributedObject)instance.getDistributedObject(MapService.SERVICE_NAME, "")).getService());
mapService.getMapServiceContext().getMapContainer("myMap").setMapConfig(newMapConfig);
Note that this API is not visible/documented so it might not work in future versions.
We are using this in our application when we need to insert several million entries in a distributed map at startup. Disabling the backup cut the insertion time by 30%. After the data are inserted, we enable the backup.

The Hazelcast internals are not really designed to be modifiable. What do you want to modify?


Clear remote Redis cache in Spring Boot application

I'm using Spring Boot 2.3 and I'm using the default cache mechanism using app.properties.
I defined all values:
spring.cache.type = redis
spring.redis.host = host
spring.redis.port = port
spring.redis.timeout = 4000
spring.redis.password = psw
spring.cache.redis.time-to-live = 28800000
I take advantage of the cache in Spring Repository for example:
@Cacheable(cacheNames = "contacts")
@Override
Page<Contact> findAll(Specification specification, Pageable pageable);
It works as expected. Redis, however, is a cluster shared by a couple of my applications, and I need the second application to be able to remove some or all keys in Redis.
Application A1 takes advantage of the cache and puts keys into it. Application A2 needs to clear some or all of those keys.
In A2 I did:
cacheManager.getCacheNames().forEach(cacheName -> cacheManager.getCache(cacheName).clear());
but of course the list of cache names is empty because this app doesn't add keys to the cache and, in any case, doesn't have the same keys as A1.
I would have to list the remote keys and then clear them. Is there a simple way to do this without using the Spring Data Redis library?
You could define a separate prefix for all of your cache entries in Redis, something like a namespace for your cache.
Then you could flush all keys in that namespace.
Note: ensure that CacheManager has only Redis cache and doesn't have an in-memory cache (L1).
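To make the namespace idea concrete, here is a minimal plain-Java sketch. It does not talk to Redis at all: a TreeMap stands in for the shared keyspace, and the prefix "a1:" and the key names are made up for illustration. On a real server the equivalent would be iterating the keys that match the prefix (for example with SCAN and a MATCH pattern) and deleting them.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class PrefixFlushSketch {

    /** Removes every key that starts with the given prefix and returns how many were removed. */
    public static int flushNamespace(Map<String, String> keyspace, String prefix) {
        int removed = 0;
        Iterator<String> it = keyspace.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().startsWith(prefix)) {
                it.remove(); // removing through the iterator also removes the map entry
                removed++;
            }
        }
        return removed;
    }

    /** Builds a small sample keyspace shared by two hypothetical applications. */
    public static Map<String, String> sampleKeyspace() {
        Map<String, String> keyspace = new TreeMap<>();
        keyspace.put("a1:contacts::page-0", "...");
        keyspace.put("a1:contacts::page-1", "...");
        keyspace.put("other-app:sessions::42", "...");
        return keyspace;
    }

    public static void main(String[] args) {
        Map<String, String> keyspace = sampleKeyspace();
        int removed = flushNamespace(keyspace, "a1:");
        System.out.println(removed + " removed, remaining: " + keyspace.keySet());
    }
}
```

The point of the prefix is that A2 never needs to know A1's cache names or individual keys, only the agreed namespace.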

Hazelcast persisting and loading data on all nodes

I have a 2-node distributed cache setup which needs persistence for both members.
I have MapStore and MapLoader implemented, and the same code is deployed on both nodes.
The MapStore and MapLoader work fine in a single-member setup, but after another member joins, they continue to work only on the first member, and all inserts or updates made by the second member are persisted to disk via the first member.
My requirement is that each member should be able to persist to disk independently, so that the distributed cache is backed up on all members and not just the first one.
Is there a setting I can change to achieve this?
Here is my Hazelcast Spring configuration.
@Bean
public HazelcastInstance hazelcastInstance(H2MapStorage h2mapStore) throws IOException{
MapStoreConfig mapStoreConfig = new MapStoreConfig();
mapStoreConfig.setImplementation(h2mapStore);
mapStoreConfig.setWriteDelaySeconds(0);
YamlConfigBuilder configBuilder=null;
if(new File(hazelcastConfiglocation).exists()) {
configBuilder = new YamlConfigBuilder(hazelcastConfiglocation);
}else {
configBuilder = new YamlConfigBuilder();
}
Config config = configBuilder.build();
config.setProperty("hazelcast.jmx", "true");
MapConfig mapConfig = config.getMapConfig("requests");
mapConfig.setMapStoreConfig(mapStoreConfig);
return Hazelcast.newHazelcastInstance(config);
}
Here is my Hazelcast YAML config. It is placed in /opt/hazlecast.yml, which is picked up by the Spring config above.
hazelcast:
  group:
    name: tsystems
  management-center:
    enabled: false
    url: http://localhost:8080/hazelcast-mancenter
  network:
    port:
      auto-increment: true
      port-count: 100
      port: 5701
    outbound-ports:
      - 0
    join:
      multicast:
        enabled: false
        multicast-group: 224.2.2.3
        multicast-port: 54327
      tcp-ip:
        enabled: true
        member-list:
          - 192.168.1.13
Entire code is available here :
https://bitbucket.org/samrat_roy/hazelcasttest/src/master/
This might just be bad luck and low data volumes, rather than an actual error.
On each node, try running the localKeySet() method and printing the results.
This will tell you which keys are on which node in the cluster. The node that owns key "X" will invoke the map store for that key, even if the update was initiated by another node.
If you have low data volumes, it may not be a 50/50 data split. At an extreme, 2 data records in a 2-node cluster could have both data records on the same node.
If you have 1,000 data records, it's pretty unlikely that they'll all be on the same node.
So the other thing to try is add more data and update all data, to see if both nodes participate.
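To get a feel for how key ownership behaves at different data volumes, here is a toy model in plain Java. It is only an assumption-laden sketch: it uses String.hashCode() and a round-robin partition-to-node assignment, which is not how Hazelcast actually hashes keys or assigns partitions. The only real value in it is the default partition count of 271. On a live cluster, localKeySet() remains the way to see the actual split.

```java
public class PartitionSplitSketch {

    static final int PARTITION_COUNT = 271; // Hazelcast's default partition count

    /** Hypothetical owner lookup: hash the key to a partition, deal partitions out round-robin. */
    public static int ownerOf(String key, int nodeCount) {
        int partitionId = Math.floorMod(key.hashCode(), PARTITION_COUNT);
        return partitionId % nodeCount;
    }

    /** Counts how many of the keys "key-0" .. "key-(keyCount-1)" land on each node. */
    public static int[] keysPerNode(int keyCount, int nodeCount) {
        int[] counts = new int[nodeCount];
        for (int i = 0; i < keyCount; i++) {
            counts[ownerOf("key-" + i, nodeCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print the per-node counts for a tiny and a larger key set.
        for (int keyCount : new int[] {2, 1000}) {
            int[] counts = keysPerNode(keyCount, 2);
            System.out.println(keyCount + " keys -> node0=" + counts[0] + ", node1=" + counts[1]);
        }
    }
}
```

Whichever node owns a key's partition is the one whose map store gets called for that key, regardless of which member issued the put.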
OK, after struggling a lot I noticed a teeny tiny but critical detail in the documentation:
Datastore needs to be a centralized system that is accessible from all Hazelcast members. Persistence to a local file system is not supported.
This is absolutely in line with what I was observing.
https://docs.hazelcast.org/docs/latest/manual/html-single/#loading-and-storing-persistent-data
However, not to be discouraged, I found out that I could use event listeners to do what I needed.
@Component
public class HazelCastEntryListner
implements EntryAddedListener<String,Object>, EntryUpdatedListener<String,Object>, EntryRemovedListener<String,Object>,
EntryEvictedListener<String,Object>, EntryLoadedListener<String,Object>, MapEvictedListener, MapClearedListener {
@Autowired
@Lazy
private RequestDao requestDao;
I created this class and hooked it into the config like so:
MapConfig mapConfig = config.getMapConfig("requests");
mapConfig.addEntryListenerConfig(new EntryListenerConfig(entryListner, false, true));
return Hazelcast.newHazelcastInstance(config);
This worked flawlessly: I am able to replicate data to the embedded database on each node.
My use case was to cover HA failover edge cases. During HA failover, the slave node needs to know the working memory of the active node.
I am not using Hazelcast as a cache; rather, I am using it as a data syncing mechanism.

Hazelcast distributed map: what is the default eviction policy?

I have a distributed map in Hazelcast, something like this:
ClientConfig clientConfig = new ClientConfig();
clientConfig.getGroupConfig().setName("clusterName").setPassword("clusterPWD");
clientConfig.getNetworkConfig().addAddress("X.X.X.X");
clientConfig.setInstanceName(InstanceName);
HazelcastInstance instance = HazelcastClient.newHazelcastClient(clientConfig);
[...]
map = instance.getMap("MAP_NAME");
[...]
// a lot of map.put();
[...]
// a lot of map.get();
I need to avoid OOM problems, so the cache has to be cleaned up regularly.
EDIT: It seems that by default there is no eviction, so the cache has to be cleaned with some explicit policy.
I tried adding a hazelcast-client.xml to the classpath with this configuration
<near-cache name="wm_info">
<max-size>3</max-size>
<time-to-live-seconds>5</time-to-live-seconds>
<max-idle-seconds>5</max-idle-seconds>
<eviction-policy>LRU</eviction-policy>
<invalidate-on-change>true</invalidate-on-change>
<in-memory-format>OBJECT</in-memory-format>
</near-cache>
and I also tried adding this code
EvictionConfig evictionConfig = new EvictionConfig()
.setEvictionPolicy(EvictionPolicy.LRU)
.setSize(2);
NearCacheConfig nearCacheConfig = new NearCacheConfig()
.setName(WM_MAP_NAME)
.setInMemoryFormat(InMemoryFormat.BINARY)
.setInvalidateOnChange(true)
.setTimeToLiveSeconds(5)
.setEvictionConfig(evictionConfig);
clientConfig.addNearCacheConfig(nearCacheConfig);
but neither works: the items are still in the cache even after several minutes.
EDIT2: The only thing that seems to work is:
map.put(code, json, 5, TimeUnit.SECONDS);
Any alternatives?
Thanks
Andrea
By default, eviction and expiration are not configured for a map; you have to configure them explicitly if you don't want your map to exceed a threshold. If you keep putting entries into a map with the default configuration, you'll eventually get an OOM.
Below is a map configuration which enables eviction with the least-recently-used policy. When the map size reaches the configured threshold, some of the entries will get evicted.
If you want the entries to expire as well, you can also configure time-to-live-seconds and max-idle-seconds.
<hazelcast>
<map name="default">
...
<time-to-live-seconds>0</time-to-live-seconds>
<max-idle-seconds>0</max-idle-seconds>
<eviction-policy>LRU</eviction-policy>
<max-size policy="PER_NODE">5000</max-size>
...
</map>
</hazelcast>
Take a look at the Map Eviction section of the documentation
https://docs.hazelcast.org/docs/3.11.2/manual/html-single/index.html#map-eviction
To add to what Ali commented, you have to add a size restriction of some sort on the cluster map side (number of entries, memory size, etc). Then you add an eviction policy to tell Hazelcast which entries to evict when it hits the threshold and needs to put in new values.
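For anyone unsure what "size threshold plus LRU policy" means in practice, here is a tiny plain-Java illustration built on LinkedHashMap. This is only an analogy for the behaviour (drop the least recently used entry once the limit is exceeded); it is not how Hazelcast implements map eviction internally, and the class name is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;

    public LruSketch(int maxSize) {
        // accessOrder = true: iteration order goes from least to most recently accessed
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry.
        return size() > maxSize;
    }

    public static void main(String[] args) {
        LruSketch<String, Integer> cache = new LruSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a", so "b" becomes the least recently used entry
        cache.put("c", 3); // exceeds the limit of 2, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

Without the size limit there is nothing to trigger an eviction, which is why a policy alone is not enough.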

Vault configuration in springboot application

I'm working on a microservice using Spring Boot. I have three questions; answers to any or all of them are much appreciated. Thanks in advance.
Background: We need to read a key from Vault during application startup and keep it in a variable for later use (to avoid repeated hits on Vault). The value has a TTL, so the application should refresh it and pick up the new value whenever one is configured in Vault.
Q1: How do we load the values and ensure they are loaded only once (i.e. Vault is hit only once)?
Q2: How do we get the new values whenever there is a change?
Q3: How do we test this locally?
Use guava cache to store values (assuming they are strings, but you can change it to any type) like this:
LoadingCache<String, String> vaultData = CacheBuilder.newBuilder()
.expireAfterAccess(10, TimeUnit.MINUTES)
.build(
new CacheLoader<String, String>() {
public String load(String key) throws Exception {
return actuallyLoadFromVault(key);
}
});
This way, when your code reads a key from vaultData for the first time, it is loaded using actuallyLoadFromVault (which you need to write, of course); after that, any access to that key via vaultData hits the cached value stored in memory.
With proper configuration after 10 minutes the value will be wiped from the cache (please read https://github.com/google/guava/wiki/CachesExplained#when-does-cleanup-happen and How does Guava expire entries in its CacheBuilder? to configure that correctly).
You might need to set max cache size to limit the memory consumption. See documentation for details.
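Regarding Q3 (testing locally): one option is to keep the Vault call and the clock behind small interfaces so both can be faked in a test. The sketch below is a plain-Java stand-in for the idea, not Guava and not a Vault client. The class name, the loader function and the injected clock are all assumptions for illustration, and unlike the expireAfterAccess example above it measures the time-to-live from the moment the value was loaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class TtlCacheSketch {

    /** A cached value together with the time it was loaded. */
    private static final class Entry {
        final String value;
        final long loadedAtMillis;

        Entry(String value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, String> loader; // stand-in for the real Vault lookup
    private final long ttlMillis;
    private final LongSupplier clock;               // injectable so tests can control time

    public TtlCacheSketch(Function<String, String> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, calling the loader only on first use or after the TTL has elapsed. */
    public synchronized String get(String key) {
        long now = clock.getAsLong();
        Entry entry = entries.get(key);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }

    public static void main(String[] args) {
        TtlCacheSketch cache = new TtlCacheSketch(
                key -> "value-of-" + key,      // hypothetical loader instead of a Vault call
                10 * 60 * 1000L,               // 10 minutes
                System::currentTimeMillis);
        System.out.println(cache.get("db-password"));
    }
}
```

In a test you would pass a counter-backed loader and a clock you advance by hand, then check that the loader ran once before the TTL and again after it.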

Get HBase region size via API

I am trying to write a balancer tool for HBase which could balance the regions of a table across region servers by region count and/or region size (sum of store file sizes). I could not find any HBase API class which returns the region size or related info. I have already checked a few of the classes which can be used to get other table/region info, e.g. org.apache.hadoop.hbase.client.HTable and HBaseAdmin.
I am thinking that another way to implement this is to use one of the Hadoop classes which return the size of directories in the file system; e.g. org.apache.hadoop.fs.FileSystem lists the files under a particular HDFS path.
Any suggestions?
I use this to do managed splits of regions, but, you could leverage it to load-balance on your own. I also load-balance myself to spread the regions ( of a given table ) evenly across our nodes so that MR jobs are evenly distributed.
Perhaps the code-snippet below is useful?
final HBaseAdmin admin = new HBaseAdmin(conf);
final ClusterStatus clusterStatus = admin.getClusterStatus();
for (ServerName serverName : clusterStatus.getServers()) {
final HServerLoad serverLoad = clusterStatus.getLoad(serverName);
for (Map.Entry<byte[], HServerLoad.RegionLoad> entry : serverLoad.getRegionsLoad().entrySet()) {
final String region = Bytes.toString(entry.getKey());
final HServerLoad.RegionLoad regionLoad = entry.getValue();
long storeFileSize = regionLoad.getStorefileSizeMB();
// other useful thing in regionLoad if you like
}
}
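Once you have the store file sizes per region, the balancing decision itself can be quite simple. Here is a rough plain-Java sketch of one possible approach (greedy: largest region first, always onto the currently lightest server). It is not what HBase's own balancer does, it ignores region count and data locality, and the region names and sizes are invented; actually moving the regions would still go through the admin API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RegionBalanceSketch {

    /** Returns, for each server index, the names of the regions assigned to it. */
    public static List<List<String>> assign(Map<String, Long> regionSizesMb, int serverCount) {
        // Sort regions by size, largest first (the sort is stable, so ties keep input order).
        List<Map.Entry<String, Long>> regions = new ArrayList<>(regionSizesMb.entrySet());
        regions.sort(Map.Entry.<String, Long>comparingByValue().reversed());

        long[] loadMb = new long[serverCount];
        List<List<String>> plan = new ArrayList<>();
        for (int s = 0; s < serverCount; s++) {
            plan.add(new ArrayList<>());
        }

        for (Map.Entry<String, Long> region : regions) {
            // Pick the server with the smallest total so far (lowest index wins ties).
            int target = 0;
            for (int s = 1; s < serverCount; s++) {
                if (loadMb[s] < loadMb[target]) {
                    target = s;
                }
            }
            plan.get(target).add(region.getKey());
            loadMb[target] += region.getValue();
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("r1", 500L);
        sizes.put("r2", 300L);
        sizes.put("r3", 200L);
        sizes.put("r4", 100L);
        System.out.println(assign(sizes, 2));
    }
}
```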
What's wrong with the default Load Balancer?
From the Wiki:
The balancer is a periodic operation which is run on the master to redistribute regions on the cluster. It is configured via hbase.balancer.period and defaults to 300000 (5 minutes).
If you really want to do it yourself, you could indeed use the Hadoop API and, more specifically, the FileStatus class. This class acts as an interface to represent the client side information for a file.
