I am attempting to configure Redis as a cache layer in a Java application. Specifically, the intended use of Redis will be to maintain session state. Each user in the application would be represented by:
a UUID corresponding to the session ID
a userId value
last access time
and maybe a few other small fields
I am confused about how to use Redis hashes. The Jedis interface is:
Jedis#hset(byte[] key, byte[] field, byte[] value)
That is, the Redis hash has a key, which in turn points to another map of fields and values.
What design should I use to:
Efficiently store each user's UUID in the hash
Also take advantage of Redis' ability to expire entries
To the second point above, if I wanted to expire an entry, it would have to be done at the key, and not field, level. But, this would then imply that each UUID would have to be a separate key in the hash, and it is not clear whether that would be good Redis design.
Using Redis hashes for session state is very common. The standard approach is to use the session ID as the key and to use the hash fields for the rest of the session state. This design has the following desirable characteristics:
You can fetch the session state in O(1) time.
You can set an expiration time for the session.
You can store arbitrary amounts of information in the session.
I believe that meets your requirements.
Your use of the phrases "the Redis hash has a key" and "key in the hash" makes me think you're misunderstanding how hashes work. The key is the name of the hash, if you will, not a member of it. The HSET signature is specifying which hash you want to modify (the key), and what field and value to set in it.
Here's an example (using Redis commands) of what creating a session might look like:
HMSET session:123 userId 5 last_login 2019-02-13 ...
EXPIRE session:123 2592000
Then you can get session data with:
HGET session:123 last_login
Or set it with:
HSET session:123 last_login 2019-02-18
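Since the question is about Jedis specifically, here is what that flow might look like in Java. This is a minimal sketch, assuming an already-connected Jedis instance; the SessionStore class and its method names are my own illustration, not part of any library:

import redis.clients.jedis.Jedis;
import java.time.Instant;
import java.util.UUID;

public class SessionStore {
    // 30 days, matching the EXPIRE value above
    private static final int SESSION_TTL_SECONDS = 2592000;
    private final Jedis jedis;

    public SessionStore(Jedis jedis) {
        this.jedis = jedis;
    }

    // Create a session hash keyed by a random UUID; the expiry applies
    // to the whole key, which is exactly what you want for a session.
    public String createSession(String userId) {
        String sessionId = UUID.randomUUID().toString();
        String key = "session:" + sessionId;
        jedis.hset(key, "userId", userId);
        jedis.hset(key, "last_login", Instant.now().toString());
        jedis.expire(key, SESSION_TTL_SECONDS);
        return sessionId;
    }

    // Read a single field back (HGET).
    public String getField(String sessionId, String field) {
        return jedis.hget("session:" + sessionId, field);
    }
}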
I have an object Company and multiple methods that can be used to get this object. Ex. GetById, GetByEmail, GetByName.
What I'd like is to cache those method calls, with the possibility of invalidating all cache entries related to one object at once.
For example, a company is cached. There are 3 entries in cache with following keys:
Company:GetById:123
Company:GetByEmail:foo@bar.com
Company:GetByName:Acme
All three keys are related to one company.
Now let's assume that company has changed. Then I would like to invalidate all keys related to this company. I didn't find any built-in solution for that purpose.
Tagging cache entries with some common id (companyId for example) and then removing all entries by it would be great, but this feature doesn't seem to exist.
So to answer your question directly, you'd probably want to maintain all the keys related to your company in a list, scan through that list, and delete all the associated keys with a DEL (or UNLINK) command.
So something like:
LPUSH companies-keys:Acme Company:GetById:123 Company:GetByEmail:foo@bar.com Company:GetByName:Acme
Then
RPOP companies-keys:Acme
and for each entry you get out of the list:
UNLINK keyname
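Sticking with Jedis from the first question, the tag-and-drain pattern might look like the sketch below. The class and method names are illustrative; note that UNLINK requires Redis 4.0+, so use del instead of unlink on older servers:

import redis.clients.jedis.Jedis;

public class CacheInvalidator {
    private final Jedis jedis;

    public CacheInvalidator(Jedis jedis) {
        this.jedis = jedis;
    }

    // Record that a cache key belongs to a company (LPUSH onto the tracking list).
    public void tagKey(String company, String cacheKey) {
        jedis.lpush("companies-keys:" + company, cacheKey);
    }

    // Pop every tracked key (RPOP returns null once the list is empty)
    // and UNLINK it; the tracking list deletes itself as it drains.
    public void invalidate(String company) {
        String listKey = "companies-keys:" + company;
        String cacheKey;
        while ((cacheKey = jedis.rpop(listKey)) != null) {
            jedis.unlink(cacheKey);
        }
    }
}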
To answer it not so directly: you may want to consider using a Hash rather than just keys; that way you can modify one of the fields in the hash rather than having to invalidate all the keys associated with it.
So you could create it with:
HSET companies:123 id 123 email foo@bar.com name acme
Then you could update a particular entry in the company record with HMSET:
HMSET companies:123 email bar@foo.com
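In Jedis the hash-based variant is just as short; a sketch, assuming Jedis 3.x or later for the map overload of hset (the connection details are illustrative):

import redis.clients.jedis.Jedis;
import java.util.HashMap;
import java.util.Map;

public class CompanyCache {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("localhost", 6379);

        // Create the company hash in one call (HSET with multiple fields).
        Map<String, String> company = new HashMap<>();
        company.put("id", "123");
        company.put("email", "foo@bar.com");
        company.put("name", "acme");
        jedis.hset("companies:123", company);

        // Later, update a single field in place; nothing needs invalidating.
        jedis.hset("companies:123", "email", "bar@foo.com");
    }
}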
Since it sounds like being able to look up a given record by different fields is really important to your use case, you may also want to consider adding RediSearch and indexing the fields you want to search on. For the set of fields listed above, an index of:
FT.CREATE companies-idx ON HASH PREFIX 1 companies: SCHEMA id TAG email TEXT name TEXT
might be appropriate. Then you could look up a company with a given email like:
FT.SEARCH companies-idx "@email:foo"
I have a User struct with ID and LoginName fields and I want this struct to be accessible by either of these fields with a single call to the DB. I know BoltDB is not supposed to handle arbitrary field indexing etc. (unlike SQL), but this case is a little different, as I happen to know in advance the additional field to be used as an index.
So is there some kind of secondary key or multiple-key indexing? Or maybe some strategy that I fail to see? If not, then I'll just implement it with two calls; I just prefer a "cleaner" solution...
Thanks!
No, it's not there. BoltDB is a lot like Go: clean and simple. And building a layer on top is easy. BoltDB even allows update transactions to be implemented trivially, so two or more buckets can be updated, or not, atomically. So creating an update transaction that keeps two or more buckets in sync is easy. But it sounds like you know that and just wanted to check that you aren't missing something.
There is no secondary key indexing in BoltDB, but you can implement it.
You can store a LoginName-to-ID mapping in another bucket, and it will technically be the "secondary key" for your struct. That is, first obtain the primary key value (the ID) from the secondary key (the LoginName), and then fetch the User struct by its primary key.
If most of your calls are on the LoginName key, invert the layout: store the User struct under the LoginName key and keep an ID-to-LoginName mapping in the other bucket.
Be careful: you have to maintain consistency on your own, so remember to update both buckets within the same transaction.
I am a newbie to the CI framework. I am trying to understand what the session ID is for. Here is the description from the CI doc:
The user's unique Session ID (this is a statistically random string with very strong entropy, hashed with MD5 for portability, and regenerated (by default) every five minutes)
I am just wondering under what occasions I should use this.
Thanks
The session ID in CodeIgniter is essentially either:
if using a database for session storage: a random hash (string) that references the session data in the database, so CodeIgniter knows what data belongs to whom
if not using a database: an encrypted set of data that CodeIgniter uses (and that you can leverage as well) to track the user (e.g. whether they are logged in, their user ID)
What might be a good way to introduce BIGINT into the ASP.NET Membership functionality to reference users uniquely, and to use that BIGINT field as a tenant_id? It would be perfect to keep the existing functionality generating UserIds in the form of GUIDs and not to implement a membership provider from scratch. Since the application will be running on multiple servers, the BIGINT tenant_id must be unique, and it should not depend on some central authority generating these IDs. It will be easy to use these tenant_ids with a SPLIT AT command down the road, which will allow bucketing users into new federated members. Any thoughts on this?
Thanks
You can use bigint, but you may have to modify all stored procedures that rely on the user ID. Making the ID globally unique is usually not a problem: as long as the ID is the primary key, the database will force it to be unique, and you will get errors when inserting duplicate values (in that case, you can modify the ID and retry).
So the most important difference is that you may need to modify stored procedures. You have a choice here. If you use GUIDs, you don't need to do anything, but it may be difficult to predict how to split the federation to balance queries. As pointed out in another thread (http://stackoverflow.com/questions/10885768/sql-azure-split-on-uniqueidentifier-guid/10890552#comment14211028_10890552), you can split existing data at the midpoint, but you don't know which federation future data will be inserted into. There's a potential risk that federations will become unbalanced, and you may need to merge and split them at regular intervals to keep them in shape.
By using bigint, you have better control over the key. For example, say you have two federations: the first has IDs from 1 to 10000, and the second has IDs from 10001 to 20000. When creating a new user, you first check how many records are in each federation. Suppose federation 1 has 500 records and federation 2 has 1000 records; to balance the load, you choose to insert into federation 1, so you pick an ID between 1 and 10000. The trade-off, again, is that with bigint you may need to do more work to modify stored procedures.
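As a rough sketch of that balancing logic (the class and field names are illustrative, and in practice the record counts would come from queries against each federation member):

public class FederationPlacement {

    public static class Federation {
        final long minId;       // lowest ID this federation owns
        final long maxId;       // highest ID this federation owns
        final long recordCount; // current number of records

        Federation(long minId, long maxId, long recordCount) {
            this.minId = minId;
            this.maxId = maxId;
            this.recordCount = recordCount;
        }
    }

    // Insert into whichever federation currently holds fewer records; the
    // caller then picks an unused ID inside that federation's range, relying
    // on the primary key to reject duplicates (and retrying on error).
    public static Federation pickTarget(Federation f1, Federation f2) {
        return f1.recordCount <= f2.recordCount ? f1 : f2;
    }
}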
I am making a site where each account will have an ID.
But I don't want the IDs to be sequential, meaning:
id=1
id=2
...
id=1000
What I want is to have random IDs:
id=2355
id=5647734
id=23532
...
(The reason is to prevent robots from checking all account profiles by just incrementing an ID in the URL - and maybe other reasons, but that is not the question.)
But, I am worried about performance on registration.
It will be something like this:
while (RANDOM_ID is not taken): generate new RANDOM_ID
On generating a new ID for the new account, I will query the database (MySQL) to check whether the ID exists, once per generation.
Is there any better solution for this?
Is there any disadvantage of using random IDs?
Thanks in advance.
There are many, many reasons not to do this:
Your solution, as written, is not transactionally safe; two transactions running at the same time could both generate the same "random" ID.
If you serialize the transaction in order to make it safe, you will slaughter performance because the query will keep every single collision row locked until it finds a spare ID.
Using a random ID as the primary key will fragment the hell out of your clustered index. This is bad enough with UUIDs; the whole point of an auto-generated identity column is that you can generate a safe sequence out of it.
Why not use a regular primary key, but just don't use it in any of your URLs? Generate a secondary, non-sequential ID along with it - such as a UUID - index it, and use this column in any public-facing segments of your application instead of the primary key if you are really worried about security.
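A minimal sketch of that design in Java, assuming MySQL via JDBC; the accounts table and public_id column are illustrative names, not anything prescribed:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

public class AccountDao {
    // Assumed schema (illustrative):
    //   CREATE TABLE accounts (
    //     id        BIGINT AUTO_INCREMENT PRIMARY KEY, -- internal, sequential
    //     public_id CHAR(36) NOT NULL UNIQUE,          -- random, used in URLs
    //     name      VARCHAR(255)
    //   );

    // Insert a new account and return the random public ID for use in URLs.
    public String createAccount(Connection conn, String name) throws SQLException {
        String publicId = UUID.randomUUID().toString();
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO accounts (public_id, name) VALUES (?, ?)")) {
            ps.setString(1, publicId);
            ps.setString(2, name);
            ps.executeUpdate();
        }
        return publicId; // expose this in links, never the auto-increment id
    }
}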
You can use UUIDs. A UUID is a unique identifier, generated partly from a timestamp (in version 1), and collisions are so unlikely that you don't have to do a query to check.
I do not know what language you're using, but there should be a library or sample code for this in most languages.
Yes, you can use a UUID, but keep your auto_increment field. Just add a new field and set it to something like md5(microtime(true).rand()), or whatever other method you like, and use that unique key throughout the site to build links instead of exposing the primary key in URLs.