Trying to build a data set of two cache tables (which are currently stored in SQL Server) - one is the actual cache table (CacheTBL); the other is the staging table (CacheTBL_Staging).
The table structure has two columns - "key", "value"
So I'm wondering how to implement this in Redis as I'm a total noob to this NoSQL stuff. Should I use a SET or LIST? Or something else?
You need to decide whether you want a separate Redis key for each entry, using SET and GET, or whether you want to put the entries into hashes with HSET and HGET. If you use the first approach, your keys should include a prefix to distinguish between main and staging. If you use hashes, this is not necessary, because the hash name itself distinguishes the two. You probably also need to decide how you want to check for cache validity, and what your cache flushing strategy should be. This normally requires some additional data structures in Redis.
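As a rough sketch of the two layouts described above, the Python snippet below only builds the Redis command that each layout would send for one entry. The command_for helper, the layout names and the ":" separator are assumptions made for illustration; the table names come from the question.

```python
# Hypothetical helper: returns the Redis command (as a tuple) that would
# store one cache entry under each of the two layouts.

def command_for(layout: str, table: str, key: str, value: str) -> tuple:
    if layout == "strings":
        # One Redis key per entry; the table name is used as a prefix.
        return ("SET", f"{table}:{key}", value)
    if layout == "hash":
        # One hash per table; the entry key becomes a field of that hash.
        return ("HSET", table, key, value)
    raise ValueError(f"unknown layout: {layout}")


print(command_for("strings", "CacheTBL", "k1", "v1"))
print(command_for("hash", "CacheTBL_Staging", "k1", "v1"))
```

With the hash layout the whole staging table lives under a single Redis key, so it may be possible to promote it to the main table in one step with RENAME, which overwrites the destination key.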


Is there sorted set functionality in Tarantool?

I'm starting work on a project that will require a lot of work with sorted sets. I need to keep some sets sorted and do CRUD operations as fast as possible. Is there any Tarantool functionality that allows inserting data into a sorted set, like the Redis ZADD command? Or do I have to sort the data on my own (using C or Lua scripts)? Or maybe sorted selects from Tarantool are fast enough? Please give me some opinions or advice.
In Tarantool, a TREE index automatically keeps your data sorted. Create a simple space with a TREE primary key on the first field. You can store any JSON data in the second, third, fourth, ... field, or you can format the space to reflect your schema so that the stored values conform to it, just like in a relational database.

Is it bad practice to store JSON members with Redis GEOADD?

My application should handle a lot of entities (100,000 or more) with locations and needs to display only those within a given radius. I basically store everything in SQL but use Redis for caching and optimization (mainly GEORADIUS).
I am adding the entities like the following example (not exactly this, I use Laravel framework with the built-in Redis facade but it does the same as here in the background):
GEOADD k 19.059982 47.494338 {\"id\":1,\"name\":\"Foo\",\"address\":\"Budapest, Astoria\",\"lat\":47.494338,\"lon\":19.059982}
Is it bad practice? Or will it have a negative impact on performance? Should I store only IDs as members and make a follow-up query to get the corresponding entities?
This is a matter of the requirements. There's nothing wrong with storing the raw data as members as long as each member is unique (and it is unique, given the "id" field). In fact, this is both simple and performant, as all data is returned with a single query (assuming that's what is actually needed).
That said, there are at least two considerations for storing the data outside the Geoset, and just "referencing" it by having members reflect some form of their key names:
A single data structure, such as a Geoset, is limited by the resources of a single Redis server. Storing a lot of data and members can require more memory than a single server can provide, which would limit the scalability of this approach.
Unless each entry's data is small, it is unlikely that all query types would require all data returned. In such cases, keeping the raw data in the Geoset generates a lot of wasted bandwidth and ultimately degrades performance.
When data needs to be updated, it can become too expensive to try and update (i.e. ZREM and then GEOADD) small parts of it. Having everything outside, perhaps in a Hash (or maybe something like RedisJSON), makes more sense then.
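A minimal sketch of the "reference" approach described above, written in Python: the geoset member is only a key name, and the full record goes into a separate Hash. The commands_for_entity helper and the "entity:<id>" naming scheme are assumptions for illustration; the snippet only builds the commands and does not talk to a server.

```python
def commands_for_entity(geo_key: str, entity: dict) -> list:
    """Build the Redis commands that store one entity by reference."""
    entity_key = f"entity:{entity['id']}"  # hypothetical key naming scheme

    # Flatten the record into field/value pairs for HSET.
    fields = []
    for name, value in entity.items():
        fields += [name, str(value)]

    return [
        # The geoset member is just the entity key, not the JSON payload.
        ("GEOADD", geo_key, entity["lon"], entity["lat"], entity_key),
        # The full record lives in its own Hash and can be updated alone.
        ("HSET", entity_key, *fields),
    ]


entity = {"id": 1, "name": "Foo", "lat": 47.494338, "lon": 19.059982}
for command in commands_for_entity("k", entity):
    print(command)
```

A radius query would then return only the "entity:<id>" members, and the application would fetch the fields it actually needs with a second round trip.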

Downside of many caches in Spring

Due to the limitation of not being able to evict entries based on a partial key, I am thinking of a workaround using the cache name as my partial key and evicting all (there would only be one) entries in the cache. For example, let's say there are 2 key-value pairs like so:
"123#name1" -> value1,
"124#name2" -> value2
Ideally, at the time of eviction, I would like to remove all keys that contain the string "123". However, as this is not supported, the workaround I'm thinking of is to have the following:
"123" cache: "name1" -> value1
"124" cache: "name2" -> value2
Then at eviction, I would simply specify to remove all keys in the "123" cache.
The downside of this of course is that there would be a lot of different caches. Is there any performance penalty to this?
From reading this, it seems Redis at least only uses the cache name as a prefix. So it is not creating multiple separate caches underneath it. But I would like to verify my understanding.
I am also looking to use Redis as my underlying cache provider if that helps.
You can use a few approaches to overcome this:
Use grouped data structures like sets, sorted sets and hashes: Each of them supports a very high number of member elements, so you can use them to store your cache items and do the relevant lookups. However, do check the performance difference (it should be very small) between this kind of lookup and a direct key-value lookup.
Once you want to evict a group of cache keys of a similar type, you just remove that data structure's key from Redis.
Use Redis database numbers: Redis databases are just numbers that provide a namespace in which your key-values can live. If you need more than the default 16, you would have to edit redis.conf (the "databases" setting) to increase the number of databases available. To group similar items, you would put them in the same database number, and empty that database with a single command (FLUSHDB) whenever you want to flush that group of keys.
The caveat here is that, although you would be able to use the same Redis connection, you would have to switch databases with the Redis SELECT command.
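The first approach (one hash per group, evicted by deleting the hash key) might look roughly like the Python sketch below. It works at the Redis level rather than through Spring's cache abstraction. InMemoryClient is a stand-in written for this example so that it runs without a server; it mimics only the hset, hget and delete method names of the redis-py client, and the cache_put, cache_get and evict_group helpers are hypothetical.

```python
class InMemoryClient:
    """Stand-in for a Redis client; supports only hset, hget and delete."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        self._hashes.setdefault(name, {})[key] = value

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)

    def delete(self, name):
        self._hashes.pop(name, None)


def cache_put(client, group, key, value):
    # The partial key ("123") names the hash; the rest is the field.
    client.hset(group, key, value)


def cache_get(client, group, key):
    return client.hget(group, key)


def evict_group(client, group):
    # Deleting the hash key removes every entry of the group at once.
    client.delete(group)


client = InMemoryClient()
cache_put(client, "123", "name1", "value1")
cache_put(client, "124", "name2", "value2")
evict_group(client, "123")
print(cache_get(client, "123", "name1"), cache_get(client, "124", "name2"))
```

One trade-off of this design is that an expiration time applies to the hash key as a whole, so entries inside one group cannot easily expire independently.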

Best-performing method for associating arbitrary key/value pairs with a table row in a Postgres DB?

I have an otherwise perfectly relational data schema in place for my Postgres 8.4 DB, but I need the ability to associate arbitrary key/value pairs with several of my tables, with the assigned keys varying by row. Key/value pairs are user-generated, so I have no way of predicting them ahead of time or wrangling orderly schema changes.
I have the following requirements:
Key/value pairs will be read often, written occasionally. Reads must be reasonably fast.
No (present) need to query off of the keys or values. (But it might come in handy some day.)
I see the following possible solutions:
The Entity-Attribute-Value pattern/antipattern. Annoying, but the annoyance would be generally offset by my ORM.
Storing key/value pairs as serialized JSON data on a text column. A simple solution, and again the ORM comes in handy, but I can kiss my future self's need for queries good-bye.
Storing key/value pairs in some other NoSQL db--probably a key/value or document store. ORM is no help here. I'll have to manage the separate queries (and looming data integrity issues?) myself.
I'm concerned about query performance, as I hope to have a lot of these some day. I'm also concerned about programmer performance, as I have to build, maintain, and use the darned thing. Is there an obvious best approach here? Or something I've missed?
That's precisely what the hstore datatype is for in PostgreSQL.
It's really fast (you can index it) and quite easy to handle. The only drawback is that you can only store character data, but you'd have that problem with the other solutions as well.
Indexes support the "exists" operator, so you can query quite quickly for rows where a certain key is present, or for rows where a specific attribute has a specific value.
And with 9.0 it got even better because some size restrictions were lifted.
hstore is generally a good solution for that, but personally I prefer to use plain key:value tables: one table with definitions, another table with values, a relation to bind values to definitions, and a relation to bind values to a particular record in the other table.
Why am I against hstore? Because it's like the registry pattern, which is often mentioned as an example of an anti-pattern. You can put anything in there, it's hard to validate whether an entry is still needed, and when loading a whole row (in an ORM especially) the whole hstore is loaded, which can contain a lot of junk and make very little sense. Not to mention that the hstore data type has to be converted into your language's type and converted back again when saved, so you get some overhead from type conversion.
So I'm actually trying to convert all the hstores in the company I work for into simple key:value tables. It's not that hard a task though, because the structures kept here in hstore are huge (or at least big), and reading/writing an object creates a huge overhead of function calls. Thus even a simple query like "select * from base_product where id = 1;" makes the server sweat and hurts performance badly. I want to point out that the performance issue is not caused by the db, but by Python having to convert the results received from Postgres several times, while key:value tables do not require such conversion.
As you do not control the data, do not try to overcomplicate this.
create table sometable_attributes (
sometable_id int not null references sometable(sometable_id),
attribute_key varchar(50) not null check (length(attribute_key) > 0),
attribute_value varchar(5000) not null,
primary key (sometable_id, attribute_key)
);
This is like EAV, but without an attribute_keys table, which adds no value if you do not control what will be there.
For speed you should periodically do "cluster sometable_attributes using sometable_attributes_idx", so all attributes for one row will be physically close.
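To illustrate how such an attributes table might be read and written from application code, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in for Postgres so that it runs anywhere; the set_attribute and get_attributes helpers are hypothetical, and "insert or replace" is SQLite syntax, not Postgres syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the Postgres DB
conn.executescript("""
    create table sometable (sometable_id integer primary key);
    create table sometable_attributes (
        sometable_id integer not null references sometable(sometable_id),
        attribute_key text not null check (length(attribute_key) > 0),
        attribute_value text not null,
        primary key (sometable_id, attribute_key)
    );
""")


def set_attribute(conn, row_id, key, value):
    # Insert a new attribute or overwrite the existing one (SQLite syntax).
    conn.execute(
        "insert or replace into sometable_attributes values (?, ?, ?)",
        (row_id, key, value),
    )


def get_attributes(conn, row_id):
    # Fetch all key/value pairs of one row as a dict.
    rows = conn.execute(
        "select attribute_key, attribute_value from sometable_attributes"
        " where sometable_id = ?",
        (row_id,),
    )
    return dict(rows.fetchall())


conn.execute("insert into sometable values (1)")
set_attribute(conn, 1, "color", "red")
set_attribute(conn, 1, "size", "XL")
set_attribute(conn, 1, "color", "blue")  # overwrites the earlier value
print(get_attributes(conn, 1))
```

Because the primary key starts with sometable_id, reading all attributes of one row is a single index range lookup.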

When do we really need a key/value database instead of a key/value cache server?

Most of the time, we just get the result from the database and then save it in a cache server, with an expiration time.
When do we need to persist that key/value pair, and what is the significant benefit of doing so?
If you need to persist the data, then you would want a key/value database. In particular, as part of the NoSQL movement, many people have suggested replacing traditional SQL databases with Key/Value pair databases - but ultimately, the choice remains with you which paradigm is a better fit for your application.
Use a key/value database when you are using a key/value cache and you don't need a SQL database.
When you use memcached/mysql or similar, you need to write two sets of data access code - one for getting objects from the cache, and another from the database. If the cache is your database, you only need the one method, and it is usually simpler code.
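The difference in data access code can be sketched in Python as follows. Plain dicts stand in for the cache server, the SQL database and the key/value database, and both function names are hypothetical; a real cache write would also set an expiration time.

```python
def load_with_cache(key, cache, database):
    """Cache plus database: two code paths, one per store."""
    value = cache.get(key)
    if value is None:
        value = database.get(key)  # stands in for a SQL query
        if value is not None:
            cache[key] = value     # stands in for a cache write with expiry
    return value


def load_from_store(key, store):
    """Key/value database only: a single code path."""
    return store.get(key)


cache, database = {}, {"user:1": "alice"}
print(load_with_cache("user:1", cache, database))
print(load_from_store("user:1", {"user:1": "alice"}))
```

The write path shows the same asymmetry: with two stores, every update also has to invalidate or refresh the cached copy.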
You do lose some functionality by not using SQL, but in a lot of cases you don't need it. Only the worst applications actually leave constraint checking to the database. Ad-hoc queries become impractical at scale. The occasional lost or inconsistent record simply doesn't matter if you are working with tweets rather than financial data. How do you justify the added complexity of using a SQL database?
