We have a requirement where we need to look up entries by multiple keys, and are therefore looking for a way to maintain multiple indexes.
For example:
Trade data contains the below parameters:
Date
Stock
Price
Quantity
Account
We will be storing each trade as a list with Stock as the key. This gives us the ability to query all the trades of a given stock. However, we will also have queries such as listing all the trades in an account, and we would like to serve those from this same cache instead of building a new one. The requirement is for an in-memory cache (Java), as the latency requirements are very tight. We also need a persistent cache, so that the cache is re-populated when the application is restarted.
Please let me know if there is a good solution available, as the only persistent caches seem to be the distributed ones.
One way to make these queries faster is to create a TradeMeta object with only the attributes you would like to query on, i.e.
Date, Stock, Price, Quantity, Account
The TradeMeta objects can be stored in a map with an index on each of the above keys. This ensures Hazelcast maintains the relevant buckets internally for fast lookups. Predicates can be run against this TradeMeta map to fetch the matching keys. Once you have the keys, use getAsync to fetch the full trade objects from the trade map.
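A minimal sketch of this pattern, assuming Hazelcast 3.x-style APIs; the map names, the Trade/TradeMeta classes, and the queried attributes are illustrative:

```java
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.query.Predicates;

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Future;

public class TradeLookup {

    // Done once at startup: index the queryable attributes of TradeMeta.
    static void createIndexes(HazelcastInstance hz) {
        IMap<String, TradeMeta> tradeMetaMap = hz.getMap("tradeMetaMap");
        tradeMetaMap.addIndex("stock", false);   // unordered index for equality lookups
        tradeMetaMap.addIndex("account", false);
        tradeMetaMap.addIndex("date", true);     // ordered index also supports range queries
    }

    static List<Trade> tradesForAccount(HazelcastInstance hz, String account) throws Exception {
        IMap<String, TradeMeta> tradeMetaMap = hz.getMap("tradeMetaMap");
        IMap<String, Trade> tradeMap = hz.getMap("tradeMap");

        // 1) Query the lightweight meta map for the keys of all trades in the account.
        Set<String> keys = tradeMetaMap.keySet(Predicates.equal("account", account));

        // 2) Fetch the full Trade objects from the main map with getAsync.
        List<Future<Trade>> futures = new ArrayList<>();
        for (String key : keys) {
            futures.add(tradeMap.getAsync(key));
        }
        List<Trade> trades = new ArrayList<>();
        for (Future<Trade> f : futures) {
            trades.add(f.get());
        }
        return trades;
    }
}
```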
To persist the cache you would require Hazelcast Enterprise HD, which provides the High-Density Memory Store and the Hot Restart Store.
Implement cache in TIBCO BW
I need to implement a cache/memory in TIBCO BW.
The contents of this cache should be available across BW projects.
What I am trying to do is: when I receive a message containing multiple shipment records (a shipment and delivery number is a unique combination),
I need to first check in the cache whether any of these records exist.
If yes - reject the whole XML.
If not, push this data into the cache/memory.
Once this is done, I need to call a SOAP request-reply to an external system.
In another project, when an acknowledgement is received from the external system, I need to check the records in the message, find those records in the cache, and delete them.
Is there any way to do this?
The challenge here is that there is no unique key for the message as a whole; each record, as a combination of shipment/delivery, is unique.
Here is what I tried and the challenge with each approach:
1) I thought of putting the data in a file and naming the file after the request ID/key of each message.
Then, in the other project, check the file and delete it.
But since we don't have a key for the whole message, I cannot do that.
2) Using shared variables:
I believe shared variables are not available across BW projects, so this option is out.
3) The third option is to use an EMS queue and temporarily park the message containing the records there.
Then search in it, and if the records match, reject the request.
And on acknowledgement (in the other project), search for the records among the EMS messages and delete that particular message.
Any help on this would be appreciated.
thanks
Why don't you use a database or a file to store the records?
When you stop or restart the AppNode, or a problem occurs in it, an in-memory cache will be erased and you will not be able to retrieve the records that were not yet processed.
A 4th option is using the TIBCO activity 'Java Global Instance' to implement the cache in Java.
"The Java Global Instance shared configuration resource allows you to
specify a Java object that can be shared across all process instances
in a Java Virtual Machine (JVM). When the process engine is started,
an instance of the specified Java class is constructed."
You can find plenty of cache implementations in Java, for example https://crunchify.com/how-to-create-a-simple-in-memory-cache-in-java-lightweight-cache/
Regarding your statement:
"Challenge here is there is no unique key for a whole message. Each record with combination of shipment/delivery is unique"
So you do have a unique key - the combination of shipment/delivery is unique.
If the record size is not too large, you can use the record itself as the key "as is", or, if key size is an issue, create a unique hash for each record and use that as the key.
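As a sketch of what the shared object behind a Java Global Instance could look like, here is a minimal duplicate-detection cache keyed by the shipment/delivery combination; the class and method names are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;

public class ShipmentCache {
    // Entries keyed by "shipmentNo|deliveryNo", mapped to the insertion timestamp.
    private final ConcurrentHashMap<String, Long> entries = new ConcurrentHashMap<>();

    private static String key(String shipmentNo, String deliveryNo) {
        return shipmentNo + "|" + deliveryNo;
    }

    /** Returns true if the record was new and has been stored; false if it already existed. */
    public boolean putIfAbsent(String shipmentNo, String deliveryNo) {
        return entries.putIfAbsent(key(shipmentNo, deliveryNo), System.currentTimeMillis()) == null;
    }

    public boolean contains(String shipmentNo, String deliveryNo) {
        return entries.containsKey(key(shipmentNo, deliveryNo));
    }

    /** Called when the acknowledgement for a record arrives. */
    public void remove(String shipmentNo, String deliveryNo) {
        entries.remove(key(shipmentNo, deliveryNo));
    }
}
```

The first process would call putIfAbsent for every record and reject the XML if any call returns false; the acknowledgement process would call remove for each acknowledged record.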
We have a unique requirement where we need to create a fixed 12-digit unique number for every transaction we process successfully in our current application. The application is a set of RESTful services with an Oracle DB as the data store.
We do have the logic for generating the unique 12-digit number, but we are trying to understand where to fit this logic so that every transaction executed in this environment gets a reference to the unique ID.
We figured that keeping some part of the 12 digits in a DB sequence could be an option, but that will not work in the near future, as we will have multiple databases.
How about having a Sequencer service that is responsible for generating these unique numbers? When a new transaction is created, the entity that manages the transaction can request a unique number from this service and associate it with the transaction.
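A hedged sketch of one way such a service could pack the 12 digits; the layout (2-digit node ID + 10-digit per-node counter) and all names are assumptions, not a prescription. It stays unique as long as node IDs are unique and each node's counter survives restarts (e.g. seeded from a per-database sequence):

```java
import java.util.concurrent.atomic.AtomicLong;

public class SequencerService {
    private final long nodeId;        // 0..99, must be unique per service instance
    private final AtomicLong counter; // 0..9_999_999_999, persisted/seeded externally

    public SequencerService(long nodeId, long initialCounter) {
        if (nodeId < 0 || nodeId > 99) {
            throw new IllegalArgumentException("nodeId must be 0..99");
        }
        this.nodeId = nodeId;
        this.counter = new AtomicLong(initialCounter);
    }

    /** Returns a fixed-width 12-digit ID, e.g. "070000012345". */
    public String nextId() {
        long n = counter.incrementAndGet();
        if (n > 9_999_999_999L) {
            throw new IllegalStateException("counter exhausted for this node");
        }
        return String.format("%02d%010d", nodeId, n);
    }
}
```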
When memcached or Redis is used for data-store caching, how is the cache updated when the underlying value changes?
For example: if I read key1 from the cache the first time and it misses, I pull value1 from the store and put key1=value1 into the cache.
After that, the value of key1 changes to value2.
How is the value in the cache updated or invalidated?
Does that mean that whenever key1's value changes, either the application or the database has to check whether key1 is in the cache and update it?
Since you are using a cache, you have to tolerate the data inconsistency problem: at some point in time the data in the cache will differ from the data in the database.
You don't need to update the value in the cache every time the value changes. Otherwise the whole cache system becomes very complicated (e.g. you have to maintain the list of keys that have been cached), and it may also be unnecessary (e.g. a key-value pair might be used only once and never need updating again).
How can we update the data in the cache and keep the cache system simple?
Normally, besides setting or updating a key-value pair in the cache, we also set a TIMEOUT for each key. After that, clients can get the key-value pair from the cache. However, once a key reaches its timeout, the cache system removes the key-value pair; this is called KEY EXPIRATION. The next time a client tries to get that key from the cache, it gets nothing; this is called a CACHE MISS. In that case the client has to fetch the key-value pair from the database and write it to the cache with a new timeout.
If the data is updated in the database while the key has NOT yet expired in the cache, clients will get inconsistent data. However, once the key expires, its value is retrieved from the database and inserted into the cache by some client; after that, other clients get the updated data until it changes again.
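A minimal cache-aside sketch of this read path using the Jedis client; loadFromDatabase is a stand-in for your real data access, and the 60-second TTL is arbitrary:

```java
import redis.clients.jedis.Jedis;

public class CacheAside {
    private static final int TTL_SECONDS = 60;   // arbitrary; tune per the trade-off below

    public String get(Jedis jedis, String key) {
        String value = jedis.get(key);
        if (value == null) {                      // cache miss: key absent or expired
            value = loadFromDatabase(key);        // fall back to the source of truth
            jedis.setex(key, TTL_SECONDS, value); // re-populate with a fresh timeout
        }
        return value;
    }

    private String loadFromDatabase(String key) {
        return "value-for-" + key;                // placeholder for real DB access
    }
}
```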
How to set the timeout?
Normally, there are two kinds of expiration policy:
Expire in N seconds/minutes/hours...
Expire at some future time point, e.g. expire at 2017/7/30 00:00:00
A large timeout greatly reduces the load on the database, but the data might be out of date for a long time. A small timeout keeps the data as fresh as possible, but puts a heavy load on the database. So you have to balance this trade-off when choosing the timeout.
How does Redis expire keys?
Redis has two ways to expire keys:
When a client tries to operate on a key, Redis checks whether the key has reached its timeout. If it has, Redis removes the key and acts as if the key doesn't exist. In this way, Redis ensures that clients never see expired data.
Redis also runs an active expiration cycle that samples keys at a configured frequency; sampled keys that have reached their timeout are removed. In this way, Redis speeds up the key expiration process.
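A small demo of this behaviour from a client's point of view, again with Jedis; the host, port, and 2-second TTL are illustrative:

```java
import redis.clients.jedis.Jedis;

public class ExpireDemo {
    public static void main(String[] args) throws InterruptedException {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.setex("demo:key", 2, "value");       // set with a 2-second timeout
            System.out.println(jedis.get("demo:key")); // "value"
            System.out.println(jedis.ttl("demo:key")); // remaining seconds, e.g. 2
            Thread.sleep(2500);
            System.out.println(jedis.get("demo:key")); // null: the key has expired
        }
    }
}
```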
Alternatively, you can simply invalidate the particular cache value in the API function that inserts or updates that value. The server will then fetch the updated value on the next request, because the stale cache entry is already gone.
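A minimal sketch of that delete-on-write idea with Jedis; the updateUser/updateUserInDatabase names and the key format are hypothetical:

```java
import redis.clients.jedis.Jedis;

public class UserService {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public void updateUser(String userId, String newName) {
        updateUserInDatabase(userId, newName);   // write to the source of truth first
        jedis.del("user:" + userId);             // then drop the stale cache entry;
                                                 // the next read misses and reloads
    }

    private void updateUserInDatabase(String userId, String newName) {
        // placeholder for the real update
    }
}
```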
I had a similar issue related to stale data, especially in two cases:
When I get bulk messages/events
In this use case, I write a score to the Redis cache and read it again in a subsequent call. With bulk messages, due to weak consistency in Redis, the data might not have been replicated to all replicas by the time I read the same key again (which is generally only 1-2 ms later).
Remediation:
In this case I was getting stale data. To address it, I used a cache on top of the cache, i.e. a loading TTL cache in front of the Redis cache: reads check the loading cache first and, if the entry is not present, fall back to the Redis cache. On writes, both caches are updated, as in the sketch below.
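A sketch of this "cache on cache" idea using a Caffeine loading cache in front of Redis; the TTL, key names, and the single Jedis connection are illustrative (use a pool in real code):

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import redis.clients.jedis.Jedis;

import java.time.Duration;

public class TwoLevelScoreCache {
    // Single connection for brevity; use a JedisPool in production.
    private final Jedis jedis = new Jedis("localhost", 6379);

    // Short-lived local cache that covers the Redis replication lag window.
    private final LoadingCache<String, String> local = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofMillis(50))
            .build(key -> jedis.get(key));   // local miss falls through to Redis

    public void writeScore(String key, String score) {
        jedis.set(key, score);   // update Redis
        local.put(key, score);   // and the local cache, so an immediate re-read is fresh
    }

    public String readScore(String key) {
        return local.get(key);
    }
}
```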
In a distributed system (k8s) where I have multiple pods
(Kafka is being used as the messaging broker)
With the above strategy we hit another problem: what if data for a key previously served by, say, pod1 reaches pod2? This has a bigger impact, as it leads to data inconsistencies.
Remediation:
Here, the Kafka partition key was set to the same "key" that is set in Redis. This way, subsequent messages for a given key reach one particular pod only. If pods restart, the cache is simply built again.
This solved our problem.
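A hedged sketch of that second remediation: using the Redis key as the Kafka message key, so all messages for that key hash to the same partition and hence reach the same consumer pod. The topic, key, and payload are made up:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class KeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same message key -> same partition -> consumed by the same pod,
            // so reads against the Redis key always land where it was last written.
            producer.send(new ProducerRecord<>("scores", "user:42", "{\"score\":17}"));
        }
    }
}
```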
Consider that I have configured 1 MB of key cache (assume it can hold 13,000 keys).
Then I write some records into a column family (say 20,000).
Then I read them for the first time (all keys sequentially, in the same order they were written), and the keys start being stored in the key cache.
When the read reaches key #13,000, the key cache is completely full.
What happens to the key cache when the next keys are read? (Which key is removed in favour of the newly read key?)
Does the key cache follow FIFO, LIFO, or random eviction?
The key cache uses ConcurrentLinkedHashMap underneath, and hence its eviction policy is LRU (least recently used).
https://code.google.com/p/concurrentlinkedhashmap/#Features
https://code.google.com/p/concurrentlinkedhashmap/wiki/Design#Beyond_LRU
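To illustrate the LRU semantics (this is not Cassandra's actual implementation), here is a minimal LRU cache built on java.util.LinkedHashMap in access order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder = true: iteration order is LRU -> MRU
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; evicts the least recently used entry once over capacity.
        return size() > capacity;
    }
}
```

So in the question's scenario, once key #13,001 is read, the least recently accessed key (here, key #1, since the keys were read sequentially) is evicted.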