Using memcached to throttle connections - ruby

I'm trying to understand how rack-attack uses memcached to throttle connections.
As far as I can tell, there's no easy way to manage lists in memcached and no way to search keys by prefix. Yet rack-attack somehow keeps a running count in the cache, and after staring at the source code I still can't figure out how it works.
https://github.com/kickstarter/rack-attack/blob/master/lib/rack/attack/throttle.rb
https://github.com/kickstarter/rack-attack/blob/master/lib/rack/attack/cache.rb

It's possible to emulate namespacing, tagging and indexing with memcached, which allows you to work around many limitations (in your case you could maintain prefixes as tags). This article has some good ideas, and the memcached docs have some neat tricks too.
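For the counting itself, no list or prefix search is needed if the time window is made part of the key. The following is a minimal sketch of that fixed-window technique, not rack-attack's actual code: FakeMemcached and WindowThrottle are hypothetical names, the store is an in-memory stand-in for memcached, and `now` is passed in only to keep the example deterministic.

```ruby
# In-memory stand-in for memcached: only add/incr with a TTL, no key listing.
class FakeMemcached
  def initialize
    @data = {} # key => [value, expires_at]
  end

  # Store only if the key is absent (like memcached "add").
  def add(key, value, ttl, now)
    purge(key, now)
    return false if @data.key?(key)
    @data[key] = [value, now + ttl]
    true
  end

  # Increment an existing counter; returns nil when the key is missing.
  def incr(key, now)
    purge(key, now)
    entry = @data[key]
    return nil unless entry
    entry[0] += 1
  end

  private

  # Drop the entry if its TTL has passed.
  def purge(key, now)
    entry = @data[key]
    @data.delete(key) if entry && entry[1] <= now
  end
end

# Hypothetical throttle: allow at most `limit` requests per `period` seconds.
class WindowThrottle
  def initialize(store, limit:, period:)
    @store = store
    @limit = limit
    @period = period
  end

  def allowed?(client_id, now = Time.now.to_i)
    window = now / @period                   # same integer for the whole period
    key = "throttle:#{window}:#{client_id}"  # a new window means a new key
    ttl = @period - (now % @period) + 1      # old windows expire by themselves
    count = @store.incr(key, now)
    # First request in this window: create the counter, or increment if we lost a race.
    count ||= @store.add(key, 1, ttl, now) ? 1 : @store.incr(key, now)
    count <= @limit
  end
end
```

Each client gets one counter per window, so the only operations required are an atomic increment and an expiry, both of which memcached provides.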

Related

How to remove/flush specific key sets from a memcached server?

I have a few keys stored in the memcached server, like:
KEY-2312sdasd78
KEY-5lk65klk343
KEY-klk34k3lkl3
TEST-34k3l4k3l4
TEST-kl3k2lk3l2
Now I want to remove the keys that start with "KEY" from the memcached server.
I have searched Google, but there is no regex-based support in memcached.
Has anybody faced this kind of issue, and what is the best workaround for it?
Any help will be appreciated. Thanks.
Possible duplicate: Regex on memcached key?
Also see http://code.google.com/p/memcached-tag/
I think something like this is much easier with something like Redis because it:
Supports Transactions
Supports atomic data structures like Lists
So in Redis, when you add a key/value pair, you also add the key to a global list in the same transaction.
There's no way to do this without knowing what the keys are.
The only way that you could do something like this is by prefixing each set of keys with something common, e.g. KEY-KEYSET1-. You could then invalidate them all by internally bumping 1 to 2 in your code, which means that the existing values will not be accessed and eventually expire.
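The prefix-bumping idea above could be sketched roughly like this. KeysetCache and its method names are hypothetical, and a plain Hash stands in for memcached, so there is no expiry or eviction in the sketch.

```ruby
# Keyset invalidation by generation number: the generation is part of every key.
class KeysetCache
  def initialize(store = {})
    @store = store # plain Hash standing in for memcached
  end

  def write(keyset, key, value)
    @store[full_key(keyset, key)] = value
  end

  def read(keyset, key)
    @store[full_key(keyset, key)]
  end

  # "Delete" every key in the keyset by moving on to the next generation.
  # The old entries are never touched again and would simply expire.
  def invalidate(keyset)
    @store[generation_key(keyset)] = generation(keyset) + 1
  end

  private

  def generation_key(keyset)
    "gen:#{keyset}"
  end

  def generation(keyset)
    @store.fetch(generation_key(keyset), 1)
  end

  # Produces keys such as "KEY-KEYSET1-2312sdasd78".
  def full_key(keyset, key)
    "#{keyset}-KEYSET#{generation(keyset)}-#{key}"
  end
end
```

Against a real memcached server the generation counter would itself live in the cache, so it would need an atomic increment and a sensible fallback if it gets evicted.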

memcached usage patterns

I'm planning to introduce a caching system into my website and will use it in different layers (data, presentation, and maybe elsewhere). Since my stack is LAMP and my infrastructure is 100% cloud on AWS, I thought the natural choice would be Amazon ElastiCache (a managed installation of memcached). But...
To my surprise, I discovered that memcached completely lacks dependency management. I don't need "advanced" features like the ASP.NET cache's SqlDependency or FileDependency, but memcached doesn't offer a simple other-key dependency either, something pretty useful for building a dependency tree that greatly simplifies the invalidation process.
So, since I know memcached is used in many complex systems, am I missing something? Are there usage patterns that make this lack irrelevant?
thanks
UPDATE
As requested, here is some pseudocode to clarify what I mean:
dependency = 'ROOT_KEY';
cache:set(dependency, 0, NEVER_EXPIRE);
expire = 600;
cache:set('key1', obj1, expire, dependency);
cache:set('key2', obj2, expire, dependency);
...
cache:set('keyN', objN, expire, dependency);
//later, when I have to invalidate
cache:remove(dependency); //this will cause all keyX to be invalidated too
Based on the example in your question, memcached (and thus ElastiCache) does not support the kind of key metadata you are looking for, so you cannot relate such keys and operate on them as a group.
I suppose if you had only a handful of different "dependencies" you could simply use multiple ElastiCache instances, which would allow you to invalidate all items within each instance/dependency simultaneously. This of course might end up costing you more in AWS hardware than you would like, since you can only increase your cache sizes in discrete amounts. It would also make it impossible to do a cache lookup without knowing the dependency/instance on which the lookup is to occur.
For what you are trying to do, you might be able to use something like memory tables in MySQL/RDS if you are looking for more of a works-out-of-the-box solution. Of course you would not want to use RDS high-availability features or point-in-time restoration, as these will break, since they require writing to disk. You would basically need a standalone RDS instance doing nothing but these memory tables.
It seems none of these options however is really an exact fit for what you are looking to do, so you might need to look into either adjusting your approach (if you want to use basic AWS components), or deploying an alternate caching system on EC2.
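One way to adjust the approach is to emulate the dependency in application code: store the dependency's current version with each entry and check it on every read. This is only a sketch under that assumption. DependentCache and its methods are hypothetical, a Hash stands in for memcached, and expiry times are left out.

```ruby
# Emulated other-key dependency: an entry is valid only while the version
# recorded with it still matches the current version of its dependency.
class DependentCache
  Entry = Struct.new(:value, :dependency, :version)

  def initialize
    @store = {} # plain Hash standing in for memcached
  end

  def set(key, value, dependency)
    @store[key] = Entry.new(value, dependency, version_of(dependency))
  end

  def get(key)
    entry = @store[key]
    return nil unless entry
    return entry.value if entry.version == version_of(entry.dependency)
    @store.delete(key) # stale: the dependency changed after this entry was written
    nil
  end

  # Invalidates every entry that depends on `dependency` in one write.
  def invalidate(dependency)
    @store["dep:#{dependency}"] = version_of(dependency) + 1
  end

  private

  def version_of(dependency)
    @store.fetch("dep:#{dependency}", 0)
  end
end
```

The price is one extra lookup per read (the dependency's version), which a multi-get can usually fold into the same round trip.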

Memcache global expiration change

Is it possible to change all the key/value pairs in memcache instances with a command line?
Say, I have 10 memcache servers and they have key value pairs, and they all have the objects with 30 days expiration. But they don't expire at the same time, and I don't want all of them to expire at the same time. I want to change the objects to expire in 10 days. How can I make this change?
Is this even possible?
Can this be done from the command line, or do I have to write a program for this?
You can accomplish this by touching values periodically. The FAQ describes a way to do this.
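A rough sketch of the touching approach, assuming your application already knows which keys it stored (memcached will not list them for you). TtlStore is an in-memory stand-in and shorten_ttls is a hypothetical helper; with a real client you would call its touch operation instead. The random jitter keeps the keys from all expiring at the same moment.

```ruby
# In-memory stand-in for memcached that remembers an expiry time per key.
class TtlStore
  def initialize
    @data = {} # key => [value, expires_at]
  end

  def set(key, value, ttl, now)
    @data[key] = [value, now + ttl]
  end

  # Reset the expiry of a live key without reading or rewriting its value.
  def touch(key, ttl, now)
    entry = @data[key]
    return false if entry.nil? || entry[1] <= now
    entry[1] = now + ttl
    true
  end

  def expires_at(key)
    entry = @data[key]
    entry && entry[1]
  end
end

DAY = 24 * 60 * 60

# Gives every known key a new TTL of base_ttl plus up to `jitter` seconds.
# Returns how many keys were actually found and touched.
def shorten_ttls(store, keys, base_ttl:, jitter:, now:, rng: Random.new)
  keys.count { |key| store.touch(key, base_ttl + rng.rand(0..jitter), now) }
end
```

With 10 servers, the same loop would run through whatever client distributes keys across them, so each touch reaches the server that holds the key.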
However, memcache isn't designed for this. What you're doing seems to be more like a persistent cache scenario. If you love memcache semantics, Membase and MemcacheDB provide solutions that may better fit your needs. There are many different persistent cache systems that do this just as well.
Depending on your specs, sometimes speeding up your data source may deliver better performance than memcache. Modern DBMSs cache heavily with sensible access protocols. This is entirely dependent on what your data sources look like and how much flexibility you have in your system design.
Memcache has a text interface you can reach over telnet. There you can use flush_all or flush_all <seconds_to_wait>, if that's what you mean...

Cache vs HashMap for simple usecase

This must be very basic, but I'm curious: if I don't need distributed or cache-as-SoR models, why do we need third-party cache libraries (ehcache, memcached) when all a simple use case needs is a key-value holder, something like a HashMap?
A lot of thought goes into producing software, and the more thought, testing, and fixes others have put into it, the more valuable the software becomes. It also validates the code as a model (I didn't say a good model).
For the example above, how would you handle deleting "old" cache items? You would have to add more code and features to ensure that the cache can be emptied.
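To give a feel for that extra code, here is a minimal sketch of a plain Hash with the two features you would need first: a time-to-live and a size cap. TinyCache is a hypothetical class, and `now` is a parameter only so the behaviour is easy to check.

```ruby
# A Hash-backed cache with expiry and a maximum size.
class TinyCache
  def initialize(ttl:, max_size:)
    @ttl = ttl
    @max_size = max_size
    @data = {} # key => [value, expires_at]; Ruby Hashes keep insertion order
  end

  def put(key, value, now = Time.now.to_f)
    @data.delete(key)                        # re-insert so this key becomes the newest
    @data[key] = [value, now + @ttl]
    @data.shift while @data.size > @max_size # evict the oldest entries first
    value
  end

  def get(key, now = Time.now.to_f)
    value, expires_at = @data[key]
    return nil unless expires_at
    if expires_at <= now
      @data.delete(key) # lazily drop expired entries on access
      return nil
    end
    value
  end

  def size
    @data.size
  end
end
```

Even this leaves out thread safety, memory accounting, and statistics, which is roughly the list of problems a cache library has already solved.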
Using memcache may be overkill for a simple program, but it's already solved many of the problems that you will have and gives you a bit of extra ability.
I would also use Redis as an example. You can do a lot of stuff in your own language, but sometimes Redis makes things easier.
YMMV!
-daniel

Organizing memcache keys

I'm trying to find a good way to handle memcache keys for storing, retrieving, and updating data to/from the cache layer in a more civilized way.
Found this pattern, which looks great, but how do I turn it into a functional part of a PHP application?
The Identity Map pattern: http://martinfowler.com/eaaCatalog/identityMap.html
Thanks!
Update: I have been told about the modified memcache (memcache-tag) that apparently does a lot of this, but I can't install Linux software on my Windows development box...
Well, memcache use IS an identity map pattern. You check your cache, then you hit your database (or whatever else you're using). You can go about finding information about the source by storing objects instead of just values, but you'll take a performance hit for that.
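The check-the-cache-then-hit-the-database flow described above could look roughly like this. It is a sketch in Ruby rather than PHP; UserRepository is a hypothetical class, and two plain Hashes stand in for memcached and the database.

```ruby
# Cache-aside lookup: one well-known key per record, derived from its identity.
class UserRepository
  attr_reader :db_hits

  def initialize(cache, database)
    @cache = cache       # Hash standing in for memcached
    @database = database # Hash standing in for the real data source
    @db_hits = 0
  end

  def find(id)
    key = cache_key(id)
    cached = @cache[key]
    return cached if cached          # cache hit: the database is not touched
    @db_hits += 1
    record = @database[id]
    @cache[key] = record if record   # populate the cache for the next caller
    record
  end

  def update(id, attributes)
    @database[id] = attributes
    @cache.delete(cache_key(id))     # drop the stale copy; the next find reloads it
  end

  private

  def cache_key(id)
    "user:#{id}"
  end
end
```

Keeping the key format in a single method is what makes the keys manageable: every read, write, and invalidation for a record goes through the same name.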
You effectively cannot ask the cache what it contains as a list. To mass invalidate, you'll have to keep a list of what you put in and iterate it, or you'll have to iterate every possible key that could fit the pattern of concern. The resource you point out, memcache-tag, can simplify this, but it doesn't appear to be maintained in line with the memcache project.
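Keeping your own list of what you put in might be sketched as follows. TrackedCache is a hypothetical class and a Hash stands in for memcached; the per-group index is itself stored as an ordinary cache entry.

```ruby
# Mass invalidation by remembering which keys belong to which group.
class TrackedCache
  def initialize
    @store = {} # plain Hash standing in for memcached
  end

  def set(group, key, value)
    @store[key] = value
    index = @store.fetch(index_key(group), [])
    @store[index_key(group)] = index | [key] # add the key to the group's index once
  end

  def get(key)
    @store[key]
  end

  # Iterates the remembered keys and deletes each one; returns how many were listed.
  def delete_group(group)
    keys = @store.delete(index_key(group)) || []
    keys.each { |key| @store.delete(key) }
    keys.size
  end

  private

  def index_key(group)
    "index:#{group}"
  end
end
```

On a real memcached server the index update is a read-modify-write, so concurrent writers would need a compare-and-swap, and the index entry can be evicted like any other key.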
So your options now are iterative deletes or flushing everything that is cached. The question you should really be asking is therefore one of design. To give you a useful answer, I need to ask: why do you want to do this?

Resources