Count Redis keys without fetching them in Ruby

I'm keeping a list of online users in Redis, with one key per user. Keys are set to expire in 15 minutes, so to see roughly how many users have been active in the past 15 minutes, I can do:
redisCli.keys('user:*').count
The problem is as the number of keys grows, the time it takes to fetch all the keys before counting them is increasing noticeably. Is there a way to count the keys without actually having to fetch all of them first?

An alternative to directly indexing keys in a Set or Sorted Set is the SCAN command; which to choose depends on the use case, the memory/speed trade-off, and the precision you need for the count.
Another option is Redis HyperLogLogs; see PFADD and PFCOUNT.
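A SCAN-based count in Ruby might look like the sketch below (the helper name `count_keys` is mine; the client is assumed to expose a redis-rb-style `scan` returning `[next_cursor, keys]`). It trades one blocking KEYS call for many cheap incremental ones:

```ruby
# Count keys matching a pattern with SCAN instead of KEYS.
# SCAN walks the keyspace incrementally, so it never blocks the
# server for long the way KEYS can on a big database. Keys added
# or expired mid-scan can make the count slightly approximate.
def count_keys(redis, pattern)
  count  = 0
  cursor = "0"
  loop do
    cursor, keys = redis.scan(cursor, match: pattern, count: 1000)
    count += keys.size
    break if cursor == "0"  # SCAN returns cursor "0" when the walk is done
  end
  count
end
```

With the redis-rb gem this would be called as `count_keys(redisCli, 'user:*')`.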

Redis has no API for counting only the keys that match a pattern, so it is not available in the Ruby client either.
What I can suggest is maintaining another data structure to read the number of users from.
For instance, you can use a Redis Sorted Set, where you keep each user with the timestamp of their last activity as the score; then you can call ZCOUNT to get the current number of active users:
redisCli.zcount('active_users', 15.minutes.ago.to_i, Time.now.to_i)
From time to time you will need to clean up old values with:
redisCli.zremrangebyscore 'active_users', 0, 15.minutes.ago.to_i
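Putting the pieces together, here is a minimal Ruby sketch of the pattern (the helper names `touch_user`, `active_user_count`, and `prune_inactive` are mine; the client is assumed to expose redis-rb-style `zadd`/`zcount`/`zremrangebyscore`):

```ruby
WINDOW = 15 * 60  # activity window in seconds

# (Re)score the user with the current timestamp on every request.
def touch_user(redis, user_id, now = Time.now.to_i)
  redis.zadd('active_users', now, user_id)
end

# Count users whose last activity falls inside the window.
def active_user_count(redis, now = Time.now.to_i)
  redis.zcount('active_users', now - WINDOW, now)
end

# Periodic cleanup: drop members that fell out of the window.
def prune_inactive(redis, now = Time.now.to_i)
  redis.zremrangebyscore('active_users', 0, now - WINDOW)
end
```

ZCOUNT already ignores stale scores, so `prune_inactive` only needs to run occasionally to keep memory in check.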

Is it better to use txt file to get the current counter value instead of database?

I am working on a website in Laravel where I load the current counter value from the database, and the user then clicks a button to increase the score.
But as the website has around 4000 concurrent users at any given time, the database connections are taking their toll on the server and resulting in timeouts.
If I load the current score from a txt file and then write it back to the same file, will that be better?
Or should I use an application variable to store the score?
I have tried using the cache, but it doesn't pull the latest value. Database optimization is also not working due to the number of users.
I am looking for the best way to show and increment the counter without database usage.
A database would actually do a better job here. A NoSQL store is perfect for your use case: you can use Redis, which keeps data in memory (RAM), so reads and writes are much faster than with databases that operate on disk.
Redis supports incrementing values natively via the INCR command. INCR increments the number stored at a key by one; if the key does not exist, it is set to 0 before the operation is performed.
For example, say the key that holds the value is my_counter. You can play around with it in redis-cli like so:
redis> SET my_counter "10"
"OK"
redis> INCR my_counter
(integer) 11
redis> GET my_counter
"11"
Fortunately, there is a Redis client for Laravel. You can have a read here:
https://laravel.com/docs/5.8/redis
Good luck :)
Edit 1:
If a high number of users is causing the server to slow down, you also have server and architectural options to consider alongside a new database, such as horizontal and vertical scaling.
References:
https://github.com/phpredis/phpredis

How to get values over n minutes old on redis?

I have redis datastore with data stored using alphanumeric non-date keys. How might I get the values that have been stored longer than a certain time period?
Store the name of every key you add in a Sorted Set, with the score being the creation timestamp. To retrieve ranges, such as keys created before a given time, use ZRANGEBYSCORE.
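A sketch of that pattern in Ruby (the helper names `record_key` and `keys_older_than` and the set name are mine; the client is assumed redis-rb-compatible):

```ruby
# Index each key's creation time in a sorted set as you write it.
def record_key(redis, key, created_at = Time.now.to_i)
  redis.zadd('keys_by_ctime', created_at, key)
end

# All keys whose creation timestamp is more than `seconds` ago.
def keys_older_than(redis, seconds, now = Time.now.to_i)
  redis.zrangebyscore('keys_by_ctime', 0, now - seconds)
end
```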

Is there a limit in number of keys in a Redis SET UNION operation?

I have a scenario where I am dumping a huge amount of data from Google BigQuery into a Redis SET data structure to get better response times. I need a SET UNION operation over millions of keys. I have tested with a few thousand keys and it works fine. The question is: is there any limit on the number of keys that can be supplied to a SUNION command at a time? Is it really SUNION key1 key2 key3 ... keyN?
Consider I have enough system capacity.
[...] over millions of keys
There's no statement in Redis' documentation about a limitation on how many keys can be provided in a single sunion command.
BTW, I doubt that doing such an operation in one shot is a good idea in Redis. Remember that Redis is single-threaded: it will be blocked for the duration of this operation, and no other command will be executed until the SUNION ends.
My advice would be to do it with many SUNIONSTORE commands instead, and later read the results from the resulting sets, as if each set were one page of the result of a SUNION over millions of keys.
sunionstore key:pages:1 key1 keyN
...and later use an iterator in your application layer to walk over all the generated pages.
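A paging sketch in Ruby (the helper name `paged_union` and the key prefix are mine; `sunionstore` matches the redis-rb signature):

```ruby
# Union the source keys in slices, storing each partial union in
# its own destination set ("page"), and return the page key names.
def paged_union(redis, source_keys, page_size: 1000, prefix: 'key:pages')
  source_keys.each_slice(page_size).each_with_index.map do |slice, i|
    dest = "#{prefix}:#{i + 1}"
    redis.sunionstore(dest, *slice)
    dest
  end
end
```

Each SUNIONSTORE call still blocks, but only for its own slice, so other clients get served in between.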

Aerospike: get upsert time without explicitly storing it for records with a TTL

Aerospike is blazingly fast and reliable, but expensive. The cost, for us, is based on the amount of data stored.
We'd like the ability to query records based on their upsert time. Currently, when we add or update a record, we set a bin to the current epoch time and can run scan queries on this bin.
It just occurred to me that Aerospike knows when to expire a record based on when it was upserted, and since we can query the TTL value from the record metadata via a simple UDF, it might be possible to infer the upsert time for records with a TTL. We're effectively using space to store a value that's already known.
Is it possible to access record creation or expiry time, via UDF, without explicitly storing it?
At this point, Aerospike only stores the void time along with the record (the time at which the record expires), so the upsert time is unfortunately not available. Stay tuned, though, as I have heard there are plans for some new features that may help you. (I am part of Aerospike's Ops/Support team.)
Void time: this tracks the life of a key in the system. It is the time at which the key should expire, and it is used by the eviction subsystem.
So the TTL is derived from the void time.
Given the TTL of a record, we can only calculate the void time (now + ttl).
Based on what you have, I think you can evaluate the upsert time from the TTL only if you add the same amount of expiration to all your records, say CONSTANT_EXPIRATION_TIME.
in that case
upsert_time = now - (CONSTANT_EXPIRATION_TIME - ttl)
HTH
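That arithmetic can be sketched as follows (CONSTANT_EXPIRATION_TIME here is a hypothetical 30 days; substitute whatever fixed TTL you write your records with):

```ruby
CONSTANT_EXPIRATION_TIME = 30 * 24 * 3600  # hypothetical fixed TTL, in seconds

# void_time = now + ttl_remaining
# upsert    = void_time - CONSTANT_EXPIRATION_TIME
#           = now - (CONSTANT_EXPIRATION_TIME - ttl_remaining)
def upsert_time(ttl_remaining, now = Time.now.to_i)
  now - (CONSTANT_EXPIRATION_TIME - ttl_remaining)
end
```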

Redis session with concurrent user licensing

I want to use Redis for sessions. Users will be stored in Redis with an expiry that is refreshed on each request. I would like to implement a concurrent-user license.
How can I count number of currently stored keys?
I found out that there is the KEYS command, but it should not be used in production. I also thought about triggering something when a key expires, but again, that is not something I should rely on.
How can I implement concurrent user licensing with Redis?
This is not a great use of EXPIRE or the top level of Redis keys. If you ever want to store anything else in Redis, it will mess up your logic. Also, although you can count the total number of keys with a command like DBSIZE, the result may be inaccurate, because Redis expires keys lazily and logically-expired keys can linger until they are touched or sampled. My impression is that Redis is built so that the exact number of keys at the top level should not be important.
For cases where an exact count matters, Redis has some great data structures you can use. In your case, I'd recommend a sorted set where each member is a user_id and the score is the expiration time in Unix time. This would look something like:
ZADD users_set 1453771862 "user1"
ZADD users_set 1453771563 "user2"
ZADD users_set 1453779999 "user3"
Then, any time you need to know how many current users there are, you can just do a ZCOUNT for all expiration times higher than the current time:
ZCOUNT users_set 1453771850 +inf
>>> 2
ZADD updates the score when the member already exists, so you can also easily add users or refresh their expiration times:
ZADD users_set 1453779999 "user2"
ZCOUNT users_set 1453771850 +inf
>>> 3
This way you get an exact count of relevant users every time you do a ZCOUNT, and every operation you're doing is a relatively cheap O(log(n)).
Finally, if actually removing expired users from the sorted set matters to you, you can do that as a pretty cheap batch job with ZREMRANGEBYSCORE (clearing all scores below the current time) on whatever interval you like.
