Graceful invalidation on Redis - caching

I'm trying to find a product like Varnish that lets me handle graceful invalidation of the cache, that is, the ability to guarantee cache response time to the client. When a key's value is invalid or expired, it is not the client that fetches the content from the origin and has to wait a long time; the cache system always does that for the client separately, and in the meantime the client gets the cached content even if it's invalid.
Example of the scenarios:
Scenario where the cache value is valid.
1) Client -> cache valid -> cached object
Scenario where the cache value is invalid.
1) Client -> cache invalid -> old cache object
2) Caching system -> origin -> replace old cache object
Is there any way to do this prefetch with Redis, guaranteeing the client a cache response time?

You need to decide how you invalidate a key-value pair. After that:
Scenario where the cache value is invalid.
1) Client -> cache invalid -> old cache object
2) Caching system -> origin -> replace old cache object
If you already know that the key is invalid/expired, Redis has a command that returns the previous value and sets the new one in a single step: GETSET key value (deprecated since Redis 6.2 in favor of SET key value GET).
Example:
redis> SET mykey "Hello"
"OK"
redis> GETSET mykey "World"
"Hello"
redis> GET mykey
"World"
redis>
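The two scenarios from the question (serve the old object, refresh it separately) can be sketched in Python as follows. This is only a minimal sketch under assumptions: the cache is an in-memory dict standing in for Redis, and `fetch_from_origin`, `soft_ttl` and the class name are hypothetical names, not part of Redis or Varnish.

```python
import threading
import time


class StaleWhileRevalidateCache:
    """In-memory stand-in for the cache; a Redis client could hold the entries instead."""

    def __init__(self, fetch_from_origin, soft_ttl):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> fresh value
        self._soft_ttl = soft_ttl        # seconds before an entry counts as invalid
        self._entries = {}               # key -> (value, stored_at)
        self._refreshing = set()         # keys with a refresh already in flight
        self._lock = threading.Lock()

    def _refresh(self, key):
        # Runs in a background thread: the cache, not the client, calls the origin.
        try:
            value = self._fetch(key)
            with self._lock:
                self._entries[key] = (value, time.monotonic())
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                stale = time.monotonic() - stored_at >= self._soft_ttl
                if stale and key not in self._refreshing:
                    self._refreshing.add(key)
                    threading.Thread(
                        target=self._refresh, args=(key,), daemon=True
                    ).start()
                # Valid or not, the client gets the cached object immediately.
                return value
        # Cold miss: nothing to serve yet, so this one caller waits for the origin.
        value = self._fetch(key)
        with self._lock:
            self._entries[key] = (value, time.monotonic())
        return value
```

With Redis as the store, the same idea would need the soft expiry kept alongside the value (for example as a timestamp) and a real Redis TTL that is longer than the soft one, so the old object is still there to be served.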

Related

ehcache load from DB and read from local disk

How do I force ehcache to load all the data from the DB once, so that afterwards I only read the values from ehcache?
I am seeing examples in which every new search goes to the DB first and only the next hit comes from the cache.
getProduct("1") - goes to db - ok
getProduct("1") - goes to cache - ok
getProduct("2") - goes to db - **instead I want this from cache**
getProduct("2") - goes to cache - ok
Please advise.
If you want up-front loading of a set of information in the cache, this is something you need your application to trigger.
The cache itself does not know the valid values to getProduct and so cannot prefetch them on its own.
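An application-triggered warm-up could look roughly like the sketch below. It is written in Python with a plain dict as the cache, not with the ehcache API; `ProductCache`, `warm_up` and `load_all_products` are hypothetical names chosen for illustration.

```python
class ProductCache:
    """Cache that is filled once, up front, by the application."""

    def __init__(self, load_all_products):
        # Hypothetical callable that reads every product from the DB
        # and returns a mapping of product id -> product.
        self._load_all = load_all_products
        self._products = {}

    def warm_up(self):
        # The application calls this once at startup; only it knows
        # which keys are valid, so only it can trigger the preload.
        self._products = dict(self._load_all())

    def get_product(self, product_id):
        # After warm_up() every lookup is served from memory.
        return self._products.get(product_id)
```

After `warm_up()` has run, `get_product("2")` is answered from the cache on its first call instead of going to the DB.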

Using Redis in OpenStack Keystone, some rubbish in Redis

Recently I've been using Redis to cache tokens for OpenStack Keystone. It works fine, but some expired cache data is still in Redis.
My Keystone config:
[cache]
enabled=true
backend=dogpile.cache.redis
backend_argument=url:redis://127.0.0.1:6379
[token]
provider = uuid
caching=true
cache_time= 3600
driver = kvs
expiration = 3600
But some expired data is still in Redis:
The data is past its expiration time but is still there, because the TTL is -1.
My questions:
How can I change the settings to stop this rubbish data from being created?
Is there a graceful way to clean it up?
I tried the command 'keystone-manage token_flush', but after reading the code I realized it only cleans up the expired tokens in MySQL.
I hope this question is still relevant.
I'm trying to do the same thing as you are, and for now the only option I have found that works is the redis_expiration_time argument of dogpile.cache.redis.
Check out the dogpile.cache Redis backend API or source code.
http://dogpilecache.readthedocs.io/en/latest/api.html#dogpile.cache.backends.redis.RedisBackend.params.redis_expiration_time
The only problem with this argument is that it does not let you choose a different TTL per category; for example, you may want tokens kept for 10 minutes and the catalog for 24 hours or so. In my experience the other parameters in keystone.conf (expiration_time and cache_time on each category) just don't work. Anyway, this problem isn't relevant if you use Redis to store only Keystone tokens.
[cache]
enabled=true
backend=dogpile.cache.redis
backend_argument=url:redis://127.0.0.1:6379
# Add this line
backend_argument=redis_expiration_time:[TTL]
Just replace [TTL] with the TTL you want and you'll start to see keys with a TTL in Redis; after a while you will see that they are gone.
About the second question:
This may not be the best answer you'll see, but you can use the OBJECT IDLETIME [key] command in redis-cli to see how long a specific key has gone unused (even a GET resets the idle time). With a simple script you can delete the keys whose idle time is longer than your token revocation period.
Remember that the data in Redis isn't persistent data, meaning you can always run FLUSHALL and OpenStack and Keystone will keep working as usual, but of course the first authentications will take longer.
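The "simple script" mentioned above might look like this sketch. It assumes `client` is a redis-py `redis.Redis` instance (it only uses `scan_iter`, `object` and `delete`); the function name, the `pattern` default and the threshold are hypothetical and would need adjusting to your key layout.

```python
def delete_idle_keys(client, max_idle_seconds, pattern="*"):
    """Delete keys matching pattern that have been idle longer than the threshold.

    `client` is assumed to behave like redis.Redis from redis-py.
    Returns the list of deleted keys.
    """
    deleted = []
    # SCAN walks the keyspace incrementally instead of blocking like KEYS.
    for key in client.scan_iter(match=pattern):
        # OBJECT IDLETIME: seconds since the key was last read or written.
        idle = client.object("idletime", key)
        if idle is not None and idle > max_idle_seconds:
            client.delete(key)
            deleted.append(key)
    return deleted
```

For example, with the configuration above one could call `delete_idle_keys(redis.Redis.from_url("redis://127.0.0.1:6379"), 3600)` from a periodic job, where 3600 matches the token expiration in the question.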

Dealing with Memcached Race Conditions

I have two different sources of data which I need to marry together. Data set A will have a foo_key attribute which can map to Data set B's bar_key attribute with a one to many relationship.
Data set A:
[{ foo_key: 12345, other: 'blahblah' }, ...]
Data set B:
[{ bar_key: 12345, other: '' }, { bar_key: 12345, other: '' }, { bar_key: 12345, other: '' }, ...]
Data set A is coming from a SQS queue and any relationships with data set B will be available as I poll A.
Data set B is coming from a separate SQS queue that I am trying to dump into a memcached cache to do quick look ups on when an object drops into data set A.
Originally I was planning to use bar_key from the objects in data set B as the memcached key, but then I realized that this could overwrite the value, since many objects can share the same bar_key. Then I thought I could create a key bar_key whose value is an array of the SQS messages. But since I have multiple hosts polling the SQS queue, I think the following is possible: while one host checks whether the key is in memcached, reads it, appends the new message and sets it again, another host performs the same operation, and the first host's append is simply overwritten.
I've looked around at memcached key locking but I'm not sure I understand it entirely. Would the solution be that when I get the key/value pair from memcached I create a temporary dummy lock on a new key called bar_key_dummy that expires in x seconds, and if I try to fetch a key that has a bar_key_dummy lock active I just send the SQS message back to the queue without deleting to try again in x seconds?
Here's some pseudocode for what I have going on in my head. Does this make any sense?
store = MemCache.new(host)
sqs_messages.poll do |message|
  dummy_key = "#{message.bar_key}_dummy"
  sqs.dont_delete_message && next unless store.get(dummy_key).nil?
  # set dummy_key in memcache with a value of 1 for 3 seconds
  store.set(dummy_key, 1, 3)
  temp_data = store.get(message.bar_key) || []
  temp_data << message
  store.set(message.bar_key, temp_data, 300)
  # delete dummy key when done in case shorter than x seconds
  store.delete(dummy_key)
end
Thanks for any help!
Memcached has a special operation for this: cas (compare and swap).
The gets command returns the item along with its unique CAS value.
You can then look up and modify the data set, and issue the update with the cas command, which takes the original unique CAS value.
If the CAS value changed between the two commands, the update fails with an EXISTS error.
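The gets/cas retry loop can be sketched in Python as follows. To keep the example self-contained, `InMemoryCasStore` is a hypothetical in-memory stand-in that only mimics the gets/cas/add semantics; it is not a memcached client, and `append_message` and `max_retries` are illustrative names. A real client library exposes equivalent calls, but check its exact signatures.

```python
import threading


class InMemoryCasStore:
    """Tiny stand-in for a memcached client; only mimics gets/cas/add."""

    def __init__(self):
        self._data = {}        # key -> (value, cas_token)
        self._next_token = 1
        self._lock = threading.Lock()

    def _store(self, key, value):
        # Every successful write gets a fresh token, like memcached's CAS value.
        self._data[key] = (value, self._next_token)
        self._next_token += 1

    def gets(self, key):
        """Return (value, cas_token), or (None, None) if the key is missing."""
        with self._lock:
            return self._data.get(key, (None, None))

    def add(self, key, value):
        """Store only if the key does not exist yet; True on success."""
        with self._lock:
            if key in self._data:
                return False
            self._store(key, value)
            return True

    def cas(self, key, value, token):
        """Store only if the token still matches; True on success."""
        with self._lock:
            current = self._data.get(key)
            if current is None or current[1] != token:
                return False
            self._store(key, value)
            return True


def append_message(store, key, message, max_retries=10):
    """Append message to the list at key without losing concurrent appends."""
    for _ in range(max_retries):
        messages, token = store.gets(key)
        if messages is None:
            if store.add(key, [message]):
                return True
            continue  # another host created the key first; re-read and retry
        if store.cas(key, messages + [message], token):
            return True
        # Token changed between gets and cas: another host won, so retry.
    return False
```

Compared with the dummy-key lock in the question, no message has to be pushed back to the queue: a host that loses the race simply re-reads the list and tries again.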

stale session data - websphere

I'm seeing a stale attribute in the HTTP session within WebSphere 6, and it may be related to in-memory session replication.
Steps:
1) Object A.0 - placed into the session with ID "ABC"
2) Remove A.0 from the session
3) Object A.1 (new instance) - placed into the session with ID "ABC"
4) Retrieve object with ID "ABC" from the session - RESULT: A.1 (correct)
5) Carry out a servlet forward or a redirect (issue seen with both)
6) Retrieve "ABC" from the session - RESULT: A.0, the object that was removed from the session
Notes -
Same Session object (hashcode/session ID) used in steps 1-5, using in-memory replication across 2 JVMs (single cluster)
Time between steps 2 & 5 is 4 seconds in total
No other external threads have accessed the session in the interim
Only noticed for 1 specific use case; haven't encountered this in other use cases
Has anyone seen anything like this before, where stale data is returned from the WebSphere application server?
Thanks,
Ian.
Are you explicitly writing the changed object back to the session before you forward/redirect? In at least some versions of WebSphere, in some configurations, you must do this to ensure the change is "committed".
(If I find a clear reference for this, I'll update my answer.)

Cache consistency when using memcached and a rdbms like MySQL

I am taking a database class this semester and we are studying how to maintain cache consistency between an RDBMS and a cache server such as memcached. The consistency issues arise when there are race conditions. For example:
Suppose I do a get(key) from the cache and there is a cache miss. Because I get a cache miss, I fetch the data from the database, and then do a put(key,value) into the cache.
But, a race condition might happen, where some other user might delete the data I fetched from the database. This delete might happen before I do a put into the cache.
Thus, ideally the put into the cache should not happen, since the data is no longer present in the database.
If the cache entry has a TTL, the entry in the cache will eventually expire. But there is still a window in which the data in the cache is inconsistent with the database.
I have been searching for articles/research papers that discuss this kind of issue, but I could not find any useful resources.
This article has an interesting note on how Facebook (tries to) maintain cache consistency: http://www.25hoursaday.com/weblog/2008/08/21/HowFacebookKeepsMemcachedConsistentAcrossGeoDistributedDataCenters.aspx
Here's the gist of the article.
I update my first name from "Jason" to "Monkey"
We write "Monkey" in to the master database in California and delete my first name from memcache in California but not Virginia
Someone goes to my profile in Virginia
We find my first name in memcache and return "Jason"
Replication catches up and we update the slave database with my first name as "Monkey." We also delete my first name from Virginia memcache because that cache object showed up in the replication stream
Someone else goes to my profile in Virginia
We don't find my first name in memcache so we read from the slave and get "Monkey"
How about using a variable saved in memcache as a lock signal?
Every single memcache command is atomic.
After you retrieve the data from the db, toggle the lock on.
After you put the data into memcache, toggle the lock off.
Before deleting from the db, check the lock state.
The code below gives some idea of how to use Memcached's operations add, gets and cas to implement optimistic locking to ensure consistency of cache with the database.
Disclaimer: I do not guarantee that it's perfectly correct or that it handles all race conditions. Also, consistency requirements may vary between applications.
def read(k):
    loop:
        get(k)
        if cache_value == 'updating':
            handle_too_many_retries()
            sleep()
            continue
        if cache_value == None:
            add(k, 'updating')
            gets(k)
            get_from_db(k)
            if cache_value == 'updating':
                cas(k, 'value:' + version_index(db_value) + ':' + extract_value(db_value))
            return db_value
        return extract_value(cache_value)

def write(k, v):
    set_to_db(k, v)
    loop:
        gets(k)
        if cache_value != 'updating' and cache_value != None and version_index(cache_value) >= version_index(db_value):
            break
        if cas(k, v):
            break
        handle_too_many_retries()

# for deleting we can use some 'tombstone' as a cache value
When you read, the following happens:
if (key is not in cache) {
    fetch data from db
    put(key, value);
} else {
    return get(key)
}
When you write, the following happens:
1) delete/update data in the db
2) clear the cache
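The read and write paths above correspond to the usual cache-aside pattern. Here is a minimal Python sketch, assuming plain dicts stand in for both the database and memcached; `CacheAside` and its method names are hypothetical.

```python
class CacheAside:
    """Cache-aside: read through the cache, invalidate on every write."""

    def __init__(self, db, cache):
        self.db = db        # dict standing in for the database
        self.cache = cache  # dict standing in for memcached

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fetch from the db and populate the cache.
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # 1) update the db, 2) clear the cache entry.
        self.db[key] = value
        self.cache.pop(key, None)

    def delete(self, key):
        # 1) delete from the db, 2) clear the cache entry.
        self.db.pop(key, None)
        self.cache.pop(key, None)
```

Note that this sketch does not by itself close the race described in the question: a reader that fetched from the db before a delete can still put the old value into the cache after the cache was cleared, which is where a TTL or the add/gets/cas approach comes in.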
