Redis namespacing basics - ruby

I am really new to Redis and have been using it along with my Ruby on Rails (Rails 2.3 and Ruby 1.8.7) application using the redis gem for simple tagging functionality as a key value store. I recently realized that I could use it to maintain a user activity feed as well.
The thing is, I need the tagging data (stored as key => Sets) to stay in memory, since it is extremely important for determining the results of tagging-related operations, whereas the activity feed data could be deleted on a first-in, first-out basis, assuming I store X activities for every user.
Is it possible to namespace the Redis data sets so that one remains permanently in memory while the other stays there only temporarily? What is the general approach when unrelated data sets need different lifetimes in memory?
Would really appreciate any help on this.

You do not need to define a specific namespace for this. With Redis, you can use the EXPIRE command to set a timeout on a key by key basis.
The general policy for evicting keys when memory runs out is defined in the configuration file:
# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached? You can select among five behavior:
#
# volatile-lru -> remove the key with an expire set using an LRU algorithm
# allkeys-lru -> remove any key accordingly to the LRU algorithm
# volatile-random -> remove a random key with an expire set
# allkeys-random -> remove a random key, any key
# volatile-ttl -> remove the key with the nearest expire time (minor TTL)
# noeviction -> don't expire at all, just return an error on write operations
#
For your purpose, the volatile-lru policy should be set.
You just have to call EXPIRE on the keys you want to be volatile, and let Redis evict them. However, please note that it is difficult to guarantee that the oldest keys will be evicted first once the timeout has been triggered. More explanations here.
For your specific use case however, I would not use key expiration but rather try to simulate capped collections. If the activity feed for a given user is represented as a list of objects, it is easy to LPUSH the activity objects, and use LTRIM to limit the size of the list. You get FIFO behavior and keep memory consumption under control for free.
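The capped-list idea can be sketched as follows. This is only an illustration in Python rather than Ruby: `client` is assumed to be a redis-py style client exposing `lpush`, `ltrim` and `lrange` (the Ruby redis gem offers the same commands), and the `feed:<user id>` key scheme and the cap of 100 are hypothetical choices.

```python
MAX_ACTIVITIES = 100  # hypothetical value for "X activities per user"


def feed_key(user_id):
    """Build the per-user list key (this naming scheme is an assumption)."""
    return "feed:%s" % user_id


def push_activity(client, user_id, activity, max_items=MAX_ACTIVITIES):
    """Prepend one activity, then trim the list to the newest max_items."""
    key = feed_key(user_id)
    client.lpush(key, activity)          # newest entry goes to the head
    client.ltrim(key, 0, max_items - 1)  # drop everything past the cap


def recent_activities(client, user_id, count=10):
    """Return up to count activities, newest first."""
    return client.lrange(feed_key(user_id), 0, count - 1)
```

Because the trim runs on every push, each feed never holds more than `max_items` entries, so no expiration or eviction setting is involved for this data.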
UPDATE:
Now, if you really need to isolate data, you have two main possibilities with Redis:
Using two distinct databases. Redis databases are identified by an integer, and you can have several of them per instance. Use the SELECT command to switch between databases. Databases can be used to isolate data, but not to assign them different properties (such as an expiration policy).
Using two distinct instances. An empty Redis instance is a very light process, so several of them can be started without any problem. This is actually the best and most scalable way to isolate data with Redis. Each instance can have its own policies (including the eviction policy). The clients should open one connection per instance.
But again, you do not need to isolate data to implement your eviction policy requirements.

Related

Does EX second impact performance in Redis?

I tried googling something similar, but wasn't able to find anything on the topic.
I'm just curious: does the number of seconds set as a key's expiry have any impact on performance in Redis?
For example:
set mykey "foobarValue" EX 100 VS set mykey "foobarValue" EX 2592000
To answer this question, we need to look at how Redis works.
Redis maintains hash tables of key-value pairs, each with an optional expiry time, so each entry can be thought of as
<Key: <Value, Expiry> >
There can be other metadata associated with this as well. During GET, SET, DEL, EXPIRE and similar operations, Redis calculates the hash of the given key(s) and tries to perform the operation. Since it's a hash table, it needs to probe during any operation, and while probing it may encounter expired keys. If you have subscribed to keyspace notifications, a notification is sent and the entry is removed or updated depending on the operation being performed. Redis also rehashes, and during rehashing it may find expired keys as well. Finally, Redis runs a background task to clean up expired keys. Because this task samples keys randomly, a very small TTL means more keys expire and more events are generated.
https://github.com/antirez/redis/blob/a92921da135e38eedd89138e15fe9fd1ffdd9b48/src/expire.c#L98
A small TTL does carry a small performance cost, since Redis needs to free the memory and fix some pointers each time a key expires. You can also run out of memory, because expired keys remain in the database until they are actually removed. Similarly, with a larger expiry time a key stays in the system for longer, which can also create memory pressure.
A smaller TTL also means more cache misses for the client application, so the client will see performance issues as well.

Redis Cache throws OOM Error with volatile-lru

For debugging, we have set Redis to volatile-lru with a maxmemory of 10mb.
We are using Redis for HTTP caching in an e-commerce shop. When there are parallel requests on a page, the following error appears:
OOM command not allowed when used memory > 'maxmemory'
Shouldn't this be avoided by setting the maxmemory-policy to volatile-lru? Is Redis not fast enough to free the memory and store the new data (each request is about 200-600kb)?
From the docs:
volatile-lru: evict keys by trying to remove the less recently used (LRU) keys first, but only among keys that have an expire set, in order to make space for the new data added.
It seems like your keys might not have an expiration. If that's the case, you might want to consider using allkeys-lru as your eviction policy.
You can also use INFO stats to see if evicted_keys has a value greater than zero.
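A rough way to check both points (whether anything has been evicted, and whether the keys carry an expiration at all) is sketched below. It assumes `client` is a redis-py style client with `info`, `config_get`, `scan_iter` and `ttl`, and that `ttl` returns -1 for a key without an expire, as recent redis-py versions do. The function name and the sample size are made up for this example.

```python
def diagnose_eviction(client, sample_size=1000):
    """Collect hints about why writes fail with OOM under volatile-lru."""
    evicted = client.info("stats").get("evicted_keys", 0)
    policy = client.config_get("maxmemory-policy").get("maxmemory-policy")

    sampled = 0
    without_ttl = 0
    for key in client.scan_iter(count=100):
        if sampled >= sample_size:
            break
        sampled += 1
        if client.ttl(key) == -1:  # assumed: -1 means "exists, no expire set"
            without_ttl += 1

    return {
        "policy": policy,
        "evicted_keys": evicted,
        "sampled": sampled,
        "without_ttl": without_ttl,
    }
```

If `without_ttl` is close to `sampled` while `evicted_keys` stays at zero, volatile-lru has nothing it is allowed to remove, which matches the OOM error described in the question.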

Redis cache updating

EDIT2: Clarification: The code ALREADY has refresh-cache-on-miss logic. What I'm trying to do is reduce the number of cache misses.
I'm using Redis as a cache for an API. The idea is that when the API receives a call it first checks the cache and if the data isn't in cache the API will fetch it and cache it afterwards for next time.
At the moment the configuration is the following:
maxmemory 50mb
maxmemory-policy allkeys-lru
That is, use at most 50mb of memory, keep adding keys, and when memory is full start deleting the least recently used (LRU) keys.
Now I want to introduce a second category of keys. For this second category I'm going to set a certain expiry time. I would like to set up a mechanism that kicks in when these keys expire and refreshes them (and sets a new expiry).
How do I do this?
EDIT:
Some progress. It turns out that Redis has a pub/sub messaging system which in particular can dispatch messages on event. One of them is expiring keys, which can be enabled as such:
notify-keyspace-events Ex
I found this code, which describes a blocking Python process subscribing to Redis' messaging system. It can easily be changed to detect expiring keys and make a call to the API when a key expires, and the API will then refresh the key.
import requests
def work(self, item):
    # item['data'] is assumed to carry the name of the expired key
    requests.get('http://apiurl/?q={param}'.format(param=item['data']))
So this does precisely what I was asking about.
However, this feels way too dangerous and out of control. I can imagine a bunch of different situations in which this will fail very quickly.
So, what's a better solution?
http://redis.io/topics/notifications
Keyspace notifications allows clients to subscribe to Pub/Sub channels
in order to receive events affecting the Redis data set in some way.
Examples of the events that is possible to receive are the following:
All the keys expiring in the database 0. (e.g)
...
EXPIRE generates an expire event when an expire is set to the key, or
a expired event every time setting an expire results into the key
being deleted (see EXPIRE documentation for more info).
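For illustration, consuming the expired events described in the quoted documentation might look like the sketch below. It assumes `notify-keyspace-events Ex` is enabled and that `pubsub` behaves like the object returned by redis-py's `client.pubsub()`, with `psubscribe` and `listen`. The function name is invented for this example.

```python
EXPIRED_PATTERN = "__keyevent@*__:expired"  # expired events from any database


def expired_keys(pubsub):
    """Yield the name of each key that Redis reports as expired."""
    pubsub.psubscribe(EXPIRED_PATTERN)
    for message in pubsub.listen():
        # The first messages are subscribe confirmations; only "pmessage"
        # entries carry an expired key name in their "data" field.
        if message["type"] == "pmessage":
            yield message["data"]
```

Pub/Sub delivery is fire and forget, so events published while the subscriber is disconnected are lost. That is one of the ways this approach can fail.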
To expire keys, just use Redis' built-in expiry mechanism. You don't need to refresh the cache contents on expiry. The simplest approach is to refresh them when the code experiences a cache miss.
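A minimal sketch of that refresh-on-miss pattern, assuming `client` is a redis-py style client with `get` and `setex`, and `fetch` is a hypothetical callable that loads the value from the API:

```python
def get_or_refresh(client, key, fetch, ttl_seconds=300):
    """Return the cached value, reloading and re-caching it on a miss."""
    cached = client.get(key)
    if cached is not None:
        return cached                      # cache hit: no API call needed
    value = fetch(key)                     # cache miss: go to the source
    client.setex(key, ttl_seconds, value)  # store it with a fresh expiry
    return value
```

The TTL of 300 seconds is an arbitrary placeholder. Whatever value is chosen, Redis drops the key on its own and the next reader repopulates it, so no expiry listener is needed.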

Guava Cache: How to access without it counting for the eviction policy?

I have a Guava cache which I would like to expire after X minutes have passed from the last access on a key. However, I also periodically do an action on all the current key-vals (much more frequently than the X minutes), and I wouldn't like this to count as an access to the key-value pair, because then the keys will never expire.
Is there some way to read the value of the keys without this influencing the internal state of the cache? i.e. cache._secretvalues.get(key), where I could conceivably subclass Cache to StealthCache and do getStealth(key)? I know relying on internal stuff is non-ideal, just wondering if it's possible at all. I think when I do cache.asMap().get() it still counts as an access internally.
From the official Guava tutorials:
Access time is reset by all cache read and write operations (including
Cache.asMap().get(Object) and Cache.asMap().put(K, V)), but not by
containsKey(Object), nor by operations on the collection-views of
Cache.asMap(). So, for example, iterating through cache.entrySet()
does not reset access time for the entries you retrieve.
So, what I would have to do is iterate through the entrySet instead to do my stealth operations.

How to handle session expiry based on Redis?

I want to implement a session store based on Redis, and I would like to put the session data into Redis, but I don't know how to handle session expiry. I could loop through all the Redis keys (session IDs) and evaluate the last access time and max idle time, but then I would need to load all the keys into the client, and there may be 1000m session keys, which could lead to very poor I/O performance.
I want to let Redis manage the expiry, but there is no listener or callback when a key expires, so it is impossible to trigger HttpSessionListener. Any advice?
So you need your application to be notified when a session expires in Redis.
While Redis does not support this feature, there are a number of tricks you can use to implement it.
Update: From version 2.8.0, Redis does support this: http://redis.io/topics/notifications
First, people are thinking about it: this is still under discussion, but it might be added to a future version of Redis. See the following issues:
https://github.com/antirez/redis/issues/83
https://github.com/antirez/redis/issues/594
Now, here are some solutions you can use with the current Redis versions.
Solution 1: patching Redis
Actually, adding a simple notification when Redis performs key expiration is not that hard. It can be implemented by adding 10 lines to the db.c file of Redis source code. Here is an example:
https://gist.github.com/3258233
This short patch posts a key to the #expired list if the key has expired and starts with a '#' character (arbitrary choice). It can easily be adapted to your needs.
It is then trivial to use the EXPIRE or SETEX commands to set an expiration time for your session objects, and write a small daemon which loops on BRPOP to dequeue from the "#expired" list, and propagate the notification in your application.
An important point is to understand how the expiration mechanism works in Redis. There are actually two different paths for expiration, both active at the same time:
Lazy (passive) mechanism. The expiration may occur each time a key is accessed.
Active mechanism. An internal job regularly (randomly) samples a number of keys with expiration set, trying to find the ones to expire.
Note that the above patch works fine with both paths.
The consequence is that Redis expiration time is not accurate. If all the keys have an expiration, but only one is about to expire, and it is not accessed, the active expiration job may take several minutes to find the key and expire it. If you need some accuracy in the notification, this is not the way to go.
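To make the two paths concrete, here is a toy in-memory model in Python. It is not how Redis is implemented. The class, its method names and the sample size are invented for this sketch, and it only mimics the behavior described above: a key is removed either when it is accessed or when a random sample happens to pick it.

```python
import random
import time


class ExpiringStore:
    """Toy model of lazy and active expiration; not Redis itself."""

    def __init__(self, clock=time.time):
        self._values = {}
        self._expires = {}  # key -> absolute expiry timestamp
        self._clock = clock

    def set(self, key, value, ttl=None):
        self._values[key] = value
        if ttl is None:
            self._expires.pop(key, None)
        else:
            self._expires[key] = self._clock() + ttl

    def _expire_if_due(self, key):
        deadline = self._expires.get(key)
        if deadline is not None and deadline <= self._clock():
            del self._values[key]
            del self._expires[key]
            return True
        return False

    def get(self, key):
        """Lazy path: the key is only checked when someone accesses it."""
        self._expire_if_due(key)
        return self._values.get(key)

    def active_cycle(self, sample_size=20):
        """Active path: test a random sample of the keys that carry a TTL."""
        candidates = list(self._expires)
        sample = random.sample(candidates, min(sample_size, len(candidates)))
        return sum(1 for key in sample if self._expire_if_due(key))

    def __len__(self):
        return len(self._values)
```

In this model, a key whose deadline has passed stays in memory until `get` touches it or `active_cycle` samples it, which is the same source of inaccuracy the paragraph above describes.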
Solution 2: simulating expiration with zsets
The idea here is to not rely on the Redis key expiration mechanism, but simulate it by using an additional index plus a polling daemon. It can work with an unmodified Redis 2.6 version.
Each time a session is added to Redis, you can run:
MULTI
SET <session id> <session content>
ZADD to_be_expired <current timestamp + session timeout> <session id>
EXEC
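From a client, the same transaction could be issued roughly as follows. This is a sketch that assumes `client` is a redis-py (3.0 or later) style client, where `pipeline(transaction=True)` wraps the queued commands in MULTI/EXEC and `zadd` takes a member-to-score mapping. The function name is hypothetical.

```python
import time

EXPIRY_INDEX = "to_be_expired"


def add_session(client, session_id, content, timeout_seconds, now=None):
    """Store a session and index its expiry time in one transaction."""
    now = time.time() if now is None else now
    expires_at = now + timeout_seconds
    pipe = client.pipeline(transaction=True)  # queued as MULTI ... EXEC
    pipe.set(session_id, content)
    pipe.zadd(EXPIRY_INDEX, {session_id: expires_at})
    pipe.execute()
    return expires_at
```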
The to_be_expired sorted set is just an efficient way to access the first keys that should be expired. A daemon can poll on to_be_expired using the following Lua server-side script:
local res = redis.call('ZRANGEBYSCORE', KEYS[1], 0, ARGV[1], 'LIMIT', 0, 10)
if #res > 0 then
    redis.call('ZREMRANGEBYRANK', KEYS[1], 0, #res - 1)
    return res
else
    return false
end
The command to launch the script would be:
EVAL <script> 1 to_be_expired <current timestamp>
The daemon will get at most 10 items. For each of them, it has to use the DEL command to remove the sessions, and notify the application. If one item was actually processed (i.e. the return of the Lua script is not empty), the daemon should loop immediately, otherwise a 1 second wait state can be introduced.
Thanks to the Lua script, it is possible to launch several polling daemons in parallel (the script guarantees that a given session will only be processed once, since the keys are removed from to_be_expired by the Lua script itself).
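The polling daemon described above might be sketched like this. It assumes `client` is a redis-py style client whose `eval(script, numkeys, *keys_and_args)` returns a list of session IDs, or `None` when the script returns false. The `on_expired` callback and the function names are hypothetical.

```python
import time

POP_EXPIRED_LUA = """
local res = redis.call('ZRANGEBYSCORE', KEYS[1], 0, ARGV[1], 'LIMIT', 0, 10)
if #res > 0 then
    redis.call('ZREMRANGEBYRANK', KEYS[1], 0, #res - 1)
    return res
else
    return false
end
"""


def poll_once(client, on_expired, now=None):
    """Pop up to 10 due sessions, delete them, and notify the application."""
    now = int(time.time()) if now is None else now
    session_ids = client.eval(POP_EXPIRED_LUA, 1, "to_be_expired", now) or []
    for session_id in session_ids:
        client.delete(session_id)  # remove the session data itself
        on_expired(session_id)     # application-level notification
    return len(session_ids)


def run_daemon(client, on_expired, idle_wait=1.0):
    """Loop forever: poll again at once after work, otherwise wait a bit."""
    while True:
        if poll_once(client, on_expired) == 0:
            time.sleep(idle_wait)
```

`run_daemon` never returns, so in practice it would run in its own process or thread. Several such daemons can run side by side for the reason given above.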
Solution 3: use an external distributed timer
Another solution is to rely on an external distributed timer. The beanstalk lightweight queuing system is a good candidate for this.
Each time a session is added in the system, the application posts the session ID to a beanstalk queue with a delay corresponding to the session time out. A daemon is listening to the queue. When it can dequeue an item, it means a session has expired. It just has to clean the session in Redis, and notify the application.

Resources