How to handle session expiration with Redis?

I want to implement a session store based on Redis and put the session data into Redis, but I don't know how to handle session expiration. I could loop through all the Redis keys (session IDs) and compare the last access time against the max idle time, but that means loading all the keys into the client. There may be 1000m session keys, which would lead to very poor I/O performance.
I want to let Redis manage the expiration, but there is no listener or callback when a key expires, so it is impossible to trigger HttpSessionListener. Any advice?

So you need your application to be notified when a session expires in Redis.
While Redis does not support this feature, there are a number of tricks you can use to implement it.
Update: From version 2.8.0, Redis does support this through keyspace notifications: http://redis.io/topics/notifications
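For Redis 2.8.0 and later, a subscriber could recognize expiration events roughly as sketched below. This assumes the server runs with notify-keyspace-events Ex and that messages arrive as dictionaries with type, channel and data fields, which is the shape redis-py's pub/sub delivers. The function name expired_key is hypothetical. Note that the event fires when Redis actually deletes the key, not at the instant the TTL reaches zero, and that Pub/Sub delivery is fire-and-forget.

```python
import re

# Channel used for key-event notifications of expired keys: __keyevent@<db>__:expired
_EXPIRED_CHANNEL = re.compile(r"^__keyevent@(\d+)__:expired$")


def _to_text(value):
    # Clients may hand back bytes or str depending on their decoding settings.
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def expired_key(message):
    """Return (db, key) for an expired-key event, or None for anything else.

    `message` is assumed to look like a redis-py pub/sub message dict.
    """
    if message.get("type") not in ("message", "pmessage"):
        return None  # e.g. subscribe confirmations
    match = _EXPIRED_CHANNEL.match(_to_text(message["channel"]))
    if match is None:
        return None
    # For key-event channels, the payload is the name of the key.
    return int(match.group(1)), _to_text(message["data"])
```

A listener would call expired_key on each received message and trigger the session listener whenever the result is not None.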
First, people are thinking about it: this is still under discussion, but it might be added to a future version of Redis. See the following issues:
https://github.com/antirez/redis/issues/83
https://github.com/antirez/redis/issues/594
Now, here are some solutions you can use with the current Redis versions.
Solution 1: patching Redis
Actually, adding a simple notification when Redis performs key expiration is not that hard. It can be implemented by adding 10 lines to the db.c file of Redis source code. Here is an example:
https://gist.github.com/3258233
This short patch posts a key to the #expired list if the key has expired and starts with a '#' character (arbitrary choice). It can easily be adapted to your needs.
It is then trivial to use the EXPIRE or SETEX commands to set an expiration time for your session objects, and write a small daemon which loops on BRPOP to dequeue from the "#expired" list, and propagate the notification in your application.
An important point is to understand how the expiration mechanism works in Redis. There are actually two different paths for expiration, both active at the same time:
Lazy (passive) mechanism. The expiration may occur each time a key is accessed.
Active mechanism. An internal job regularly (randomly) samples a number of keys with expiration set, trying to find the ones to expire.
Note that the above patch works fine with both paths.
The consequence is that the Redis expiration time is not accurate. If all the keys have an expiration set, but only one is about to expire and it is never accessed, the active expiration job may take several minutes to find the key and expire it. If you need accurate notifications, this is not the way to go.
Solution 2: simulating expiration with zsets
The idea here is to not rely on the Redis key expiration mechanism, but simulate it by using an additional index plus a polling daemon. It can work with an unmodified Redis 2.6 version.
Each time a session is added to Redis, you can run:
MULTI
SET <session id> <session content>
ZADD to_be_expired <current timestamp + session timeout> <session id>
EXEC
The to_be_expired sorted set is just an efficient way to access the first keys that should be expired. A daemon can poll on to_be_expired using the following Lua server-side script:
local res = redis.call('ZRANGEBYSCORE', KEYS[1], 0, ARGV[1], 'LIMIT', 0, 10)
if #res > 0 then
    redis.call('ZREMRANGEBYRANK', KEYS[1], 0, #res - 1)
    return res
else
    return false
end
The command to launch the script would be:
EVAL <script> 1 to_be_expired <current timestamp>
The daemon will get at most 10 items. For each of them, it has to use the DEL command to remove the sessions, and notify the application. If one item was actually processed (i.e. the return of the Lua script is not empty), the daemon should loop immediately, otherwise a 1 second wait state can be introduced.
Thanks to the Lua script, it is possible to launch several polling daemons in parallel (the script guarantees that a given session will only be processed once, since the keys are removed from to_be_expired by the Lua script itself).
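The polling scheme above can be sketched without a Redis server. The following is a minimal in-memory model, not Redis code: ExpiryIndex and drain are hypothetical names, a dictionary stands in for the to_be_expired sorted set, and a mutex plays the role of the atomic Lua script.

```python
import threading


class ExpiryIndex:
    """In-memory stand-in for the to_be_expired sorted set (illustrative only)."""

    def __init__(self):
        self._deadlines = {}           # session id -> expiry timestamp (the zset score)
        self._lock = threading.Lock()  # plays the role of the atomic Lua script

    def add(self, session_id, now, timeout):
        # Equivalent of: ZADD to_be_expired <now + timeout> <session id>
        with self._lock:
            self._deadlines[session_id] = now + timeout

    def pop_expired(self, now, limit=10):
        # Read up to `limit` due entries and remove them in the same atomic
        # step, so two pollers never receive the same session id.
        with self._lock:
            due = sorted(
                (sid for sid, deadline in self._deadlines.items() if deadline <= now),
                key=lambda sid: (self._deadlines[sid], sid),
            )[:limit]
            for sid in due:
                del self._deadlines[sid]
            return due


def drain(index, sessions, now, on_expired):
    """Loop while batches come back non-empty, as the polling daemon would."""
    while True:
        batch = index.pop_expired(now)
        if not batch:
            return  # a real daemon would wait about 1 second here and poll again
        for sid in batch:
            sessions.pop(sid, None)  # stands in for DEL <session id>
            on_expired(sid)          # notify the application
```

Because removal happens inside pop_expired, running several drain loops in parallel would still hand each session to exactly one of them, which is the property the Lua script provides.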
Solution 3: use an external distributed timer
Another solution is to rely on an external distributed timer. The beanstalk lightweight queuing system is a good candidate for this.
Each time a session is added in the system, the application posts the session ID to a beanstalk queue with a delay corresponding to the session time out. A daemon is listening to the queue. When it can dequeue an item, it means a session has expired. It just has to clean the session in Redis, and notify the application.
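The delayed-queue idea can be modelled in a few lines. This is only an in-memory sketch of the behaviour, not the beanstalk API: DelayQueue, put, reserve and expire_ready_sessions are hypothetical names chosen for illustration, and time is passed in explicitly so the example is deterministic.

```python
import heapq
import itertools


class DelayQueue:
    """In-memory stand-in for a queue whose jobs become ready after a delay."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for jobs with equal ready times

    def put(self, body, now, delay):
        # The job becomes visible to consumers at now + delay.
        heapq.heappush(self._heap, (now + delay, next(self._seq), body))

    def reserve(self, now):
        """Return the next job whose delay has elapsed, or None if none is ready."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None


def expire_ready_sessions(queue, sessions, now, notify):
    # Every job that can be dequeued corresponds to a session that has timed out.
    while (session_id := queue.reserve(now)) is not None:
        sessions.pop(session_id, None)  # clean the session (DEL in Redis)
        notify(session_id)              # propagate to the application
```

With a real queuing system the daemon would block on the dequeue call instead of polling with an explicit timestamp.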

Related

Does EX second impact performance in Redis?

I tried googling something similar, but wasn't able to find anything on the topic.
I'm just curious: does the number of seconds set on a key impact performance in Redis?
For example:
set mykey "foobarValue" EX 100 VS set mykey "foobarValue" EX 2592000
To answer this question, we need to see how Redis works.
Redis maintains tables of key/value pairs with an expiry time, so each entry can be thought of as
<Key: <Value, Expiry> >
There can be other metadata associated with an entry as well. During GET, SET, DEL, EXPIRE and similar operations, Redis calculates the hash of the given key(s) and then performs the operation. Since it is a hash table, it has to probe during any operation, and while probing it may encounter expired keys. If you have subscribed to keyspace notifications, a notification is sent, and the entry is removed or updated depending on the operation being performed. Redis also rehashes, and it may find expired keys during rehashing too. Finally, Redis runs a background task that cleans up expired keys by random sampling. With a very small TTL more keys expire, so this task removes more keys and more events are generated.
https://github.com/antirez/redis/blob/a92921da135e38eedd89138e15fe9fd1ffdd9b48/src/expire.c#L98
A small TTL does carry a small performance cost, since Redis has to free memory and fix some pointers more often. You can also run out of memory, because expired keys remain in the database until they are actually removed. Similarly, with a higher expiry time a key stays in the system longer, which can create memory pressure as well.
A smaller TTL also means more cache misses for the client application, so the client will see performance issues too.
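To make the two expiration paths concrete, here is a toy model in Python. It is not how Redis is implemented: TtlStore and its methods are hypothetical, the clock is injected for determinism, and the sampling is much simpler than the real active expiry cycle. It does show that the TTL value itself is just a stored timestamp; what differs is how soon removal work happens.

```python
import random


class TtlStore:
    """Toy key/value store with lazy and sampled (active) expiration."""

    def __init__(self, clock):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, ex):
        # Storing EX 100 or EX 2592000 costs the same: one timestamp per key.
        self._data[key] = (value, self._clock() + ex)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= self._clock():
            del self._data[key]  # lazy path: removed only because it was accessed
            return None
        return value

    def active_expire_cycle(self, sample_size=20, rng=random):
        # Active path: look at a random sample of keys and drop the expired ones.
        keys = list(self._data)
        now = self._clock()
        removed = 0
        for key in rng.sample(keys, min(sample_size, len(keys))):
            if self._data[key][1] <= now:
                del self._data[key]
                removed += 1
        return removed

    def __len__(self):
        return len(self._data)
```

In this model an expired key keeps occupying memory until one of the two paths reaches it, which mirrors the memory concern described above.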

KStreams: implementing session window with processor API

I need to implement logic similar to session windows using the processor API in order to have full control over the state store. Since the processor API doesn't provide a windowing abstraction, this needs to be done manually. However, I failed to find the source code for the KStreams session window logic to get some initial ideas (specifically regarding session timeouts).
I was expecting to use punctuate method, but it's a per processor timer rather than per key timer. Additionally SessionStore<K, AGG> doesn't provide an API to traverse the database for all keys.
[UPDATE]
As an example, assume a processor instance is processing K1 and stream time is incremented, which causes the session for K2 to time out. K2 may or may not exist at all. How do you know that a specific key (like K2) exists when stream time is incremented while processing a different key? In other words, when stream time is incremented, how do you figure out which windows have expired, given that you don't know which keys exist?
This is the DSL code: https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamSessionWindowAggregate.java -- hope it helps.
It's unclear what your question is though -- it's mostly statements. So let me try to give some general answer.
In the DSL, sessions are closed based on "stream time" progress. Relying only on the input data makes the operation deterministic, whereas using wall-clock time would introduce non-determinism. Hence, using a Punctuation is not necessary in the DSL implementation.
Additionally SessionStore<K, AGG> doesn't provide an API to traverse the database for all keys.
Sessions in the DSL are based on keys and thus it's sufficient to scan the store on a per-key basis over a time range (as done via findSessions(...)).
Update:
In the DSL, each time a session window is updated, a corresponding update event is sent downstream immediately. Hence, the DSL implementation does not wait for "stream time" to advance any further but publishes the current (potentially intermediate) result right away.
To obey the grace period, the record timestamp is compared to "stream time" and if the corresponding session window is already closed, the record is skipped (cf. https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamSessionWindowAggregate.java#L146). I.e., closing a window is just a logical step (not an actual operation); the session will still be stored, and when a window is closed no additional event needs to be sent downstream because the final result was already sent downstream in the last update to the window.
Retention time itself does not need to be handled by the Processor implementation because it's a built-in feature of the SessionStore: internally, the session store maintains so-called "segments" that store sessions for a certain time period. Each time a put() is done, the store checks whether old segments can be dropped (based on the timestamp provided by put()). I.e., old sessions are deleted lazily and as bulk deletes (all sessions of a whole segment are deleted at once), as this is more efficient than individual deletes.
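The behaviour described above can be sketched in a language-neutral way. The Python below is an illustration only, not a port of the Kafka source: SessionAggregator is a hypothetical class, the aggregate is a simple record count, and the close rule (merged session end earlier than stream time minus grace) is an assumption that may differ from the actual DSL implementation.

```python
from collections import defaultdict


class SessionAggregator:
    """Per-key session windows driven by stream time (illustrative sketch)."""

    def __init__(self, inactivity_gap, grace):
        self.gap = inactivity_gap
        self.grace = grace
        self.stream_time = float("-inf")
        self.sessions = defaultdict(list)  # key -> list of [start, end, count]

    def process(self, key, timestamp):
        """Add one record; return the updated (start, end, count), or None if skipped."""
        # Stream time only advances with the data, never with the wall clock.
        self.stream_time = max(self.stream_time, timestamp)
        close_time = self.stream_time - self.grace  # assumed close rule

        # Per-key range scan: the role findSessions(...) plays in the answer.
        overlapping = [s for s in self.sessions[key]
                       if s[1] >= timestamp - self.gap and s[0] <= timestamp + self.gap]
        start = min([timestamp] + [s[0] for s in overlapping])
        end = max([timestamp] + [s[1] for s in overlapping])
        if end < close_time:
            return None  # the window is logically closed, so the late record is skipped

        # Merge the overlapping sessions into one and emit the update right away.
        count = 1 + sum(s[2] for s in overlapping)
        self.sessions[key] = [s for s in self.sessions[key] if s not in overlapping]
        self.sessions[key].append([start, end, count])
        return (start, end, count)
```

Nothing has to happen for K2 when a K1 record advances stream time: K2's session is only examined the next time a K2 record arrives, at which point it is either extended, merged, or found to be closed.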

Redis cache updating

EDIT2: Clarification: The code ALREADY has refresh-cache-on-miss logic. What I'm trying to do is reduce the number of cache misses.
I'm using Redis as a cache for an API. The idea is that when the API receives a call it first checks the cache and if the data isn't in cache the API will fetch it and cache it afterwards for next time.
At the moment the configuration is the following:
maxmemory 50mb
maxmemory-policy allkeys-lru
That is, use at most 50mb of memory, keep putting keys in there, and when memory is full start deleting the least recently used keys (LRU).
Now I want to introduce a second category of keys. For this second category I'm going to set a certain expiry time. I would like to set up a mechanism such that when these keys expire, the mechanism kicks in and refreshes them (and sets a new expiry).
How do I do this?
EDIT:
Some progress. It turns out that Redis has a pub/sub messaging system which in particular can dispatch messages on event. One of them is expiring keys, which can be enabled as such:
notify-keyspace-events Ex
I found this code, which describes a blocking Python process subscribing to Redis' messaging system. It can easily be changed to detect keys expiring and make a call to the API when a key expires, and the API will then refresh the key.
def work(self, item):
    # item['data'] carries the name of the expired key; ask the API to refresh it
    requests.get('http://apiurl/?q={param}'.format(param=item['data']))
So this does precisely what I was asking about.
However, this feels way too dangerous and out of control. I can imagine a bunch of different situations under which this will very quickly fail.
So, what's a better solution?
http://redis.io/topics/notifications
Keyspace notifications allows clients to subscribe to Pub/Sub channels
in order to receive events affecting the Redis data set in some way.
Examples of the events that is possible to receive are the following:
All the keys expiring in the database 0. (e.g)
...
EXPIRE generates an expire event when an expire is set to the key, or
a expired event every time setting an expire results into the key
being deleted (see EXPIRE documentation for more info).
To expire keys, just use Redis' built-in expiry mechanism. You don't need to refresh the cache contents on expiry; the simplest approach is to refresh when the code experiences a cache miss.
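Refresh-on-miss can be sketched as follows. This is an in-memory illustration rather than Redis code: ReadThroughCache is a hypothetical class, fetch stands for whatever call loads the data from the API's backend, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time


class ReadThroughCache:
    """Cache that reloads an entry only when a read finds it missing or expired."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]               # cache hit
        value = self._fetch(key)          # miss or expired: reload from the source
        self._entries[key] = (value, now + self._ttl)
        return value
```

Keys that nobody asks for again are never reloaded, which avoids the runaway refresh loop an expiry-triggered refresher could cause.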

Redis as cache - reset expiry

I am using Redis as a cache and would like to expire data in Redis that is not actively used. Currently, setting an expiry on an object deletes the object after the expiry time has elapsed. However, I would like to retain the object in Redis if it is read at least once before it expires.
One way I see is to store a separate expiry_key for every object and set the expiry on the expiry_key instead of the original object. Subscribe to del notifications on the expiry_key, and when a del notification is received, check whether the object was read at least once (via a separately maintained access log) during the expiry interval. If the object was not read, execute a del command on the original object. If it was read, recreate the expiry_key with the expiry interval.
This implementation requires additional systems to manage expiry, and I would prefer to do it locally with Redis.
Are there better solutions to solve this?
Resetting expiry for the object for every read will increase the number of writes to redis and hence this is not a choice.
Note the redis cache refresh is managed asynchronously via a change notification system.
You could just set the expiry key again after each read (setting a TTL on a key is O(1)).
It may make sense for your system to do this in a transaction:
MULTI
GET mykey
EXPIRE mykey 10
EXEC
You could also pipeline the commands.
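From Python, the GET plus EXPIRE pair could be issued in one round trip roughly like this. The sketch assumes client is a redis-py style client (for example redis.Redis()); get_and_touch is a hypothetical helper name.

```python
def get_and_touch(client, key, ttl_seconds):
    """Read `key` and reset its TTL in a single MULTI/EXEC round trip.

    `client` is assumed to be a redis-py style client object.
    """
    # transaction=True makes the pipeline wrap the queued commands in MULTI/EXEC.
    pipe = client.pipeline(transaction=True)
    pipe.get(key)
    pipe.expire(key, ttl_seconds)
    # execute() returns one result per queued command, in order.
    value, _ttl_was_set = pipe.execute()
    return value
```

If atomicity is not required, passing transaction=False should still batch both commands into one network round trip without MULTI/EXEC.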
This pattern is also described in the official documentation.
Refer to section "Configuring Redis as a cache" in http://redis.io/topics/config
We can set maxmemory-policy to allkeys-lru to clear inactive content from Redis. This would work for the use case I have stated.
Another way is to define a notification on the key, and then reset its expiration
see here

How do I set up a lock that will automatically time out if it does not get a keep alive signal?

I have a certain resource I want to limit access to. Basically, I am using a session-level lock. However, it is getting to be a pain writing JavaScript that covers every possible way a window can close.
Once the user leaves that page I would like to unlock the resource.
My basic idea is to use some sort of server-side timeout to unlock the resource. Basically, if I fail to unlock the resource, I want a timer to kick in and unlock it.
For example, after 30 seconds with no update from the client side, unlock the resource.
My basic question is: what sort of server-side trick can I use to do this? It is my understanding that I can't just create a thread in JSF, because it would be unmanaged.
I am sure other people do this kind of thing, what is the correct thing to use?
Thanks,
Grae
As BalusC rightfully asked, the big question is at what level of granularity you would like to do this locking. Per logged-in user, for all users, or perhaps you could get away with locking per request?
Or, and this will be a tougher one, is the idea that a single page request grabs the lock and then that specific page is intended to keep the lock between requests? E.g. as a kind of reservation. I'm browsing a hotel page, and when I merely look at a room I have made an implicit reservation in the system for that room so it can't happen that somebody else reserves the room for real while I'm looking at it?
In the latter case, maybe the following scheme would work:
In application scope, define a global concurrent map.
Keys of the map represent the resources you want to protect.
Values of the map are a custom structure which hold a read write lock (e.g. ReentrantReadWriteLock), a token and a timestamp.
In application scope, there also is a single global lock (e.g. ReentrantLock)
Code in a request first grabs the global lock, and quickly checks if the entry in the map is there.
If the entry is there it is taken, otherwise it's created. Creation time should be very short. The global lock is quickly released.
If the entry was new, it's locked via its write lock and a new token and timestamp are created.
If the entry was not new, it's locked via its read lock
If the code has the same token, it can go ahead and access the protected resource, otherwise it checks the timestamp.
If the timestamp has expired, it tries to grab the write lock.
The write lock has a time-out. When the time-out occurs give up and communicate something to the client. Otherwise a new token and timestamp are created.
This is just the general idea. In a Java EE application that I have built I used something similar (though not exactly the same) and it worked quite well.
Alternatively, you could use a Quartz job that periodically removes the stale entries. Yet another alternative is replacing the global concurrent map with e.g. a JBoss Cache or Infinispan instance. These allow you to define an eviction policy for their entries, which saves you from having to code this yourself. If you have never used those caches though, learning how to set them up and configure them correctly can be more trouble than just building a simple Quartz job yourself.
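The token-plus-timestamp idea can be reduced to a small lease table. The sketch below is a simplified variant, not the exact scheme listed above: it uses one mutex instead of per-entry read/write locks, it is written in Python rather than Java for brevity, and LeaseRegistry with its method names is hypothetical. Stale entries need no background thread, because an expired lease is simply overwritten by the next caller.

```python
import threading
import time
import uuid


class LeaseRegistry:
    """Locks that expire on their own unless the holder keeps them alive."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self._lease = lease_seconds
        self._clock = clock
        self._guard = threading.Lock()  # the single global lock
        self._leases = {}               # resource -> (token, expires_at)

    def acquire(self, resource):
        """Return a new token if the resource is free or its lease ran out, else None."""
        now = self._clock()
        with self._guard:
            held = self._leases.get(resource)
            if held is not None and held[1] > now:
                return None
            token = uuid.uuid4().hex
            self._leases[resource] = (token, now + self._lease)
            return token

    def keep_alive(self, resource, token):
        """Extend the lease; fails if the token is wrong or the lease already expired."""
        now = self._clock()
        with self._guard:
            held = self._leases.get(resource)
            if held is None or held[0] != token or held[1] <= now:
                return False
            self._leases[resource] = (token, now + self._lease)
            return True

    def release(self, resource, token):
        """Explicit unlock, e.g. when the page does manage to signal that it closed."""
        with self._guard:
            held = self._leases.get(resource)
            if held is not None and held[0] == token:
                del self._leases[resource]
                return True
            return False
```

The page would call keep_alive periodically (for example from an Ajax poll); if the window closes without notice, the calls stop and the resource becomes available again after lease_seconds.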

Resources