Is there any embeddable key-value store for Ruby? - ruby

I need a fast and reliable key-value store for Ruby. Does anything like that already exist?
The requirement is that it runs wholly inside the Ruby process, without needing any outside processes.
It may be in-memory with explicit flushes to disk.
It needs minimal value-for-key retrieval times; slower writes are acceptable.
The amount of data stored won't be huge: a few hundred thousand keys, each with a ~1 KB text value.

It turns out that the best option for me was to use a plain Hash along with Marshal to serialize it to disk.
YAML is definitely too slow for that number of objects.
Thanks to #ian-armit for reinforcing my trust in the core Ruby libraries.
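
For anyone wanting to try the Hash plus Marshal approach, here is a minimal sketch. The helper names (save_store, load_store) and the file path are made up for illustration and are not from the original post. It assumes the whole Hash fits in memory and that the file is only ever written by your own code, since Marshal.load should not be used on untrusted data.

```ruby
require "tmpdir"

# Hypothetical helpers for illustration only.
def save_store(store, path)
  tmp = "#{path}.tmp"
  # Write to a temporary file first, then rename it over the real one,
  # so a crash in the middle of a flush does not leave a truncated store.
  File.open(tmp, "wb") { |f| Marshal.dump(store, f) }
  File.rename(tmp, path)
end

def load_store(path)
  # Start with an empty Hash if nothing has been flushed yet.
  return {} unless File.exist?(path)
  File.open(path, "rb") { |f| Marshal.load(f) }
end

path = File.join(Dir.tmpdir, "kv_store_example_#{Process.pid}.marshal")

store = load_store(path)
store["user:1"] = "some ~1 KB text value"   # reads and writes are plain Hash operations
save_store(store, path)                      # explicit flush to disk

reloaded = load_store(path)
puts reloaded["user:1"]
```

Lookups never touch the disk, which matches the "fast reads, slower writes" requirement; the cost is that every flush rewrites the whole file.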

You could also try Moneta which allows you to build your own key/value store embedded in a ruby process.

Like DBM? http://www.ruby-doc.org/stdlib-1.9.3/libdoc/dbm/rdoc/DBM.html
The DBM class provides a wrapper to a Unix-style dbm or Database Manager library.
Dbm databases do not have tables or columns; they are simple key-value data stores, like a Ruby Hash except not resident in RAM. Keys and values must be strings.

You could try Oria: https://github.com/intridea/oria
Oria (oh-rye-uh) is an in-memory, Ruby-based, zero-configuration Key-Value Store. It's designed to handle moderate amounts of data quickly and easily without causing deployment issues or server headaches. It uses EventMachine to provide a networked interface to a semi-persistent store and asynchronously writes the in-memory data to YAML files.

Check out PStore. Not sure if it's fast enough though.
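
A minimal PStore sketch, in case it helps to judge whether it fits. The file path and keys are made up for illustration. PStore ships with Ruby (in recent versions as a gem bundled with the distribution), and it rewrites its file when a read-write transaction commits, so write cost grows with the size of the store.

```ruby
require "pstore"
require "tmpdir"

path = File.join(Dir.tmpdir, "pstore_example_#{Process.pid}.pstore")
db = PStore.new(path)

# All access happens inside a transaction; changes are saved to the
# file when a read-write transaction finishes without being aborted.
db.transaction do
  db["user:1"] = "Alice"
  db["user:2"] = "Bob"
end

# Passing true opens a read-only transaction.
name = db.transaction(true) { db["user:1"] }
keys = db.transaction(true) { db.roots }

puts name
puts keys.inspect
```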

Daybreak is a nice new option. Data is stored in a table in memory, so Ruby niceties are available (each, filter, map, reduce, etc.), and it appears to be faster than PStore or DBM.
See this blog post for more info.

There's LevelDB; here are the Ruby bindings.

Related

What is Redis ValueOperations?

What are Redis ValueOperations in Spring Boot?
Do they let us store key-value pairs directly in the Redis database, without creating an entity and so on, just by using RedisTemplate<String, Object>?
Also, if we use ValueOperations, how will it impact performance?
When using Redis, you should think about what data format/datatype suits your needs best, similar to what you would do when coding in any general programming language. All those operations, ValueOperations, ListOperations, SetOperations, HashOperations, StreamOperations are the support provided for interacting with the mentioned datatypes. They are provided by the RedisTemplate.
When you are using ValueOperations, you are more or less treating your whole Redis instance as a giant hash map. For example, you can store entries in Redis like current_user = "John Doe". However, you can also do something silly, such as keeping a string representation of a huge hash map against a single key: top_users = <huge_string_representing_a_hash_map>. In that second case, what if you want the value for just one key in that hash map? The task becomes more or less impossible without transferring the whole hash map into your application's RAM. If you had used Redis Hashes and HashOperations instead, it would have been a trivial task.
Going back to your question: storing a simple object using ValueOperations wouldn't degrade performance. In contrast, if you are moving huge maps around, you'll use a lot of your network bandwidth and RAM.
In summary, choose your Redis data types carefully to suit your needs.
https://redis.io/topics/data-types

My company uses memcache as object just fine, can't see need for redis in caching

I'm learning about redis/memcache and redis is clearly the more popular option. My question is about supported data types. At my company we use the memcashier library, which is built on memcached. We store temporary user data in memcache while users are making a purchase. We can easily update this object as things are added to the cart or more info about the user is given. This appears to be the same functionality as a hash in redis. I don't understand how this is only a basic string data type and how it's less powerful than a hash.
If you are using strings, that's fine - but any change involves loading the data to your application, parsing it, modifying it, and serializing it back to Redis/Memcache.
This has two problems: it's slow and non-atomic. Two servers can modify the same object at the same time and leave it in an inconsistent state, such as duplicated or missing items in a shopping cart. And again, it's slow.
With a Redis hash key, you can atomically modify specific fields of the object without loading the entire object into memory. Instead of read, parse, modify, save - you just update.
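
To make the lost-update problem concrete, here is a small simulation. It is only a model: a plain Ruby Hash stands in for the cache server, and the key names and cart contents are invented for illustration. No real Redis or memcached client is involved.

```ruby
require "json"

# A plain Hash stands in for the cache server (an assumption for illustration).
server = { "cart:42" => JSON.generate([]) }

# Two app servers read the same serialized cart at about the same time.
copy_a = JSON.parse(server["cart:42"])
copy_b = JSON.parse(server["cart:42"])

# Each one modifies its own copy and writes the whole value back.
copy_a << "book"
server["cart:42"] = JSON.generate(copy_a)
copy_b << "pen"
server["cart:42"] = JSON.generate(copy_b)   # overwrites the first writer's change

final_cart = JSON.parse(server["cart:42"])
puts final_cart.inspect   # the "book" added by the first writer is gone

# Field-level updates, modeled here as writes to individual keys of a nested
# Hash, touch only their own field, so neither writer clobbers the other.
server["cart:43"] = {}
server["cart:43"]["book"] = 1
server["cart:43"]["pen"] = 1
puts server["cart:43"].inspect
```

The second half mirrors what a per-field update on a Redis hash gives you: each writer sends only its own change, so there is no stale copy to write back.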
Besides, Redis has many data structures that can create very flexible data stores with different properties, whereas Memcache can only store strings.
BTW Redis has a module that allows you to store JSON objects just as you would a string, and manipulate them directly and atomically without getting them to the client. See Rejson.io for details.
Memcached doesn't support complex data structures.
In Redis you have Lists, Sets, Sorted Sets, Hashes, and more.
Each data structure mentioned above supports mutating one or more of its elements atomically, without replacing the entire data structure/value.
Memcached, on the other hand, is a simple key-value store. That means every operation involving an attribute change within a complex object is a read-modify-write. If you just go around blindly replacing fields in objects, you risk race conditions and atomicity issues (which you can avoid by using CAS).
If the library abstracts that complexity, that's great, but it's still less efficient than mutating only the relevant field(s).
This answer only relates to your use case. Redis holds many other virtues over memcached, which are not relevant to this question.

cocoa: what's the best way of designing a persistent cache?

I have to download some info from the Internet, such as a person's phone number. I want to save the info to disk so I can load it when my application starts. Is Core Data the best choice? I mean, is it fast enough? I want to load the info into an NSCache object; is that a good class to use?
One option is plist-type caching: key->value, strings only. Easy to code. For a small amount of data I would recommend this. It is described here.
The other option is NSArchiver->NSData: binary storage for any type of data, but you have to serialize and deserialize. More coding, but no limits (well, you are doing the transformation yourself). I prefer this one because, during development, I may later need data other than text; usually I need to cache images. It is presented here; the good answer there is actually the downvoted one!
If you are storing anything that will be used between launches of the application then using Core Data is the way to go unless you have really, really basic requirements. NSCache is better as a temporary cache that is used by the application as it is running and for data that can be recalculated if it does not already exist.

A Persistent Store for Increment/Decrementing Integers Easily and Quickly

Does there exist some sort of persistent key-value-like store that allows for quick and easy incrementing, decrementing, and retrieval of integers (and nothing else)? I know that I could implement something with a SQL database, but I see two drawbacks to that:
It's heavyweight for the task at hand. All I need is the ability to say "server[key].inc()" or "server[key].dec()"
I need the ability to handle potentially thousands of writes to a single key simultaneously. I don't want to deal with excessive resource contention. Change the value and get out - that's all I need.
I know memcached supports inc/dec, but it's not persistent. My strategy at this point is going to be to use a SQL server behind a queueing system of some sort such that there's only one process updating the database. It just seems... harder than it should be.
Is there something someone can recommend?
Redis is a key-value store that supports several data types. Integer is present, along with incr and decr commands.

Is there a fast and reliable way of serializing objects across different versions of Ruby?

I have two applications talking to each other using a queue. As of now they run exactly the same version of Ruby (1.8.7), so I'm just marshaling objects back and forth; they are only objects from the standard library, mostly hashes, strings, and time and date objects.
Right now I'm moving to Ruby 1.9.1, one app at a time, which means I'll be running one app with 1.8.7 and the other with 1.9.1 for a while. From running my tests I know Marshal will not be reliable across versions. I could use YAML, but it is much slower; JSON seems to be faster, but it does not deal directly with date/time objects.
Is there a reliable and fast way to serialize ruby objects across different versions?
I haven't tried it on ruby, but you could look at protocol buffers? Designed as a fast but portable binary format, it has a ruby port here. You would probably have to treat the generated types as a separate DTO layer, though (i.e. you map your existing data into the new types, rather than serialize your existing objects). Note that there is no inbuilt date-time support, but you could just use ticks in an epoch etc.
The key here is finding a common data type that you know will be represented the same across Ruby versions. The obvious choices here are storing data in an external database (the DB interface libraries will handle all the conversions) or writing the data out in a structured text format. If there's not a ton of data to work with (and the data is mostly standard types), I usually just store it as text; it takes longer to export/import at runtime, but the code is usually faster to write.
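
As one possible take on the structured-text route, here is a sketch that sends a Hash as JSON and carries the time value as an ISO 8601 string, converting it back on the receiving side. The message fields are invented for illustration, and the code is written for a current Ruby; on 1.8.7 JSON comes from the json gem rather than the standard library.

```ruby
require "json"
require "time"   # adds Time#iso8601 and Time.iso8601

# Hypothetical message; the field names are made up for this example.
message = {
  "id"         => 7,
  "tags"       => ["a", "b"],
  "created_at" => Time.utc(2010, 1, 2, 3, 4, 5)
}

# Sender: replace the Time with a string both Ruby versions can parse.
wire = JSON.generate(message.merge("created_at" => message["created_at"].iso8601))

# Receiver: parse the JSON, then turn the known time field back into a Time.
decoded = JSON.parse(wire)
decoded["created_at"] = Time.iso8601(decoded["created_at"])

puts wire
puts decoded["created_at"].inspect
```

The trade-off is that both sides must agree on which fields hold times, since JSON itself carries no type information for them.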
Protobufs are good, but require you to pre-define your data structures, if I recall. Thrift is similar to protobufs, but has some decent code generation features.
Apple's binary property list format sounds close to what you need. It's similar to JSON in behavior, but is more compact and supports a few extra types, including datetime and unencoded binary. There are a couple ruby implementations on github.
Your best bet may be BERT. BERT is based on Erlang's binary term serialization format. It's compact, includes datetime serialization, and is implemented in a dozen or so languages, including Ruby.
