When I set timeToLiveSeconds="100", does that mean the EhCache engine will reset the whole cache, or will it reset only the elements that have been living for 100 seconds?
I've read the EhCache documentation and it seems to suggest the first approach, but I'm not totally sure about that:
timeToLiveSeconds This is an optional attribute.
Legal values are integers between 0 and Integer.MAX_VALUE.
It is the number of seconds that an Element should live since it was
created. Created means inserted into a cache using the Cache.put
method.
0 has a special meaning, which is not to check the Element for time to
live, i.e. it will live forever.
The default value is 0.
Thank you.
It will reset only the element. Check out the source code: the getExpirationTime() method belongs to the Element class.
http://grepcode.com/file/repo1.maven.org/maven2/net.sf.ehcache/ehcache-core/2.5.0/net/sf/ehcache/Element.java#Element.getExpirationTime%28%29
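As a small illustration of the per-element behaviour (a minimal sketch against the EhCache 2.x programmatic API; the cache name and size are made up for the example), each Element carries its own expiration time derived from timeToLiveSeconds at put time:

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;
import net.sf.ehcache.config.CacheConfiguration;

public class TtlExample {
    public static void main(String[] args) {
        CacheManager manager = CacheManager.create();

        // Each element put into this cache gets its own expiration time,
        // computed from its creation time plus timeToLiveSeconds.
        CacheConfiguration config = new CacheConfiguration("demoCache", 1000)
                .timeToLiveSeconds(100);
        Cache cache = new Cache(config);
        manager.addCache(cache);

        cache.put(new Element("key", "value"));
        Element element = cache.get("key");
        // Expiration is tracked per element, not for the cache as a whole.
        System.out.println("expires at: " + element.getExpirationTime());

        manager.shutdown();
    }
}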
I just want to do rate limiting on a REST API using Redis. Could you please suggest which data structure in Redis would be appropriate? I have just used RedisTemplate, which makes it infeasible to expire an element once a key and value have been updated.
There are multiple approaches, depending on what exactly you are trying to achieve - from general "ops per second" limiting to fine-grained limits at a lower resolution, like how many posts a specific user can make per day, etc.
One very simple and elegant approach I like is an expiring counter. The technique is simple and takes advantage of the fact that INCR does not change a key's expiration time in Redis. So basically, if you want 1000 requests per second on a resource, just create a key with the value 1 (by running INCR) and expire it in a second. Then for each request, check whether the counter has reached 1000, and if not, increment it. If it has, block the request. When the time window has passed, the key will expire automatically and will be recreated on the next request.
In terms of pseudocode, the algorithm is:
def limit(resource_key):
    current = GET(resource_key)
    if current != NULL and current >= 1000:
        return ERROR
    else:
        value = INCR(resource_key)
        if value == 1:
            # expire the key itself, not the returned value
            EXPIRE(resource_key, 1)
        return OK
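For reference, here is a rough Java sketch of the same expiring-counter idea using the Jedis client (the connection details, key name and the 1000-per-second limit are placeholders, not anything prescribed above):

import redis.clients.jedis.Jedis;

public class ExpiringCounterLimiter {
    private static final int LIMIT = 1000;       // max requests per window
    private static final int WINDOW_SECONDS = 1; // window length

    private final Jedis jedis = new Jedis("localhost", 6379);

    // Returns true if the request is allowed, false if it should be blocked.
    public boolean allow(String resourceKey) {
        String current = jedis.get(resourceKey);
        if (current != null && Integer.parseInt(current) >= LIMIT) {
            return false;
        }
        long value = jedis.incr(resourceKey);
        if (value == 1) {
            // First hit in this window: start the expiry clock.
            jedis.expire(resourceKey, WINDOW_SECONDS);
        }
        return true;
    }
}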
I found 4 "proper" ways to do this:
In the cheat sheet for ActiveRecord users, the substitutes for ActiveRecord's increment and increment_counter are supposed to be album.values[:column] -= 1 # or += 1 for increment and album.update(:counter_name=>Sequel.+(:counter_name, 1))
In an SO solution, update_sql is suggested for the same effect: s[:query_volume].update_sql(:queries => Sequel.expr(3) + :queries)
In a random thread I found this one: dataset.update_sql(:exp => 'exp + 10'.lit)
In the Sequel API docs for update I found this solution: http://sequel.jeremyevans.net/rdoc/classes/Sequel/Dataset.html#method-i-update
Yet none of these solutions actually update the value and return the result in a safe, atomic way.
Solutions based on "adding a value and then saving" should, AFAIK, fail nondeterministically in multiprocessing environments, resulting in errors such as:
album's counter is 0
thread A and thread B both fetch album
thread A and thread B both increment the value in the hash/model/etc
thread A and thread B both update the counter to same value
as a result: A and B both set the counter to 1 and work with counter value 1
Sequel.expr and Sequel.+, on the other hand, don't actually return a value but a Sequel::SQL::NumericExpression, and (AFAIK) you have no way of getting the value out short of doing another DB roundtrip, which means this can happen:
album's counter is 0
thread A and B both increment the value, value is incremented by 2
thread A and B both fetch the row from the DB
as a result: A and B both set the counter to 2 and work with counter value 2
So, short of writing custom locking code, what's the solution? And if there's none, what's the best way to do it (again short of writing custom locking code :)?
Update 1
I'm generally not happy with answers saying that I want too much from life, as one answer suggests :)
The albums are just an example from the docs.
Imagine, for example, that you have a transaction counter on an e-commerce POS which can accept 2 transactions at the same time on different hosts, and that to the bank you need to send them with an integer counter unique within 24h (called a systan). Send 2 trx with the same systan and one will be declined; or worse, gaps in the counts are alerted on (because they hint at "missing transactions"), so it's not possible to use the DB's ID value.
A less severe example, but one more related to my use case: several file exports get triggered simultaneously in a background worker, and every file destination has its own counter. Gaps in the counters are alerted on, and the workers are on different hosts (so mutexes are not useful). And I have a feeling I'll soon be solving the more severe problem anyway.
DB sequences are no good either, because they would mean doing DDL on the addition of every terminal, and we're talking thousands here. Even in my less severe use case, DDLing on web portal actions is still a PITA, and might not even work depending on the caching scheme below (due to the implementations of ActiveRecord and Sequel - and in my case I use both - it might require a server restart just to register a merchant).
Redis can do this, but it seems insane to add another infrastructure component just for counters when you're sitting on an ACID-compliant database.
If you are using PostgreSQL, you can use UPDATE RETURNING: DB[:table].returning(:counter).update(:counter => Sequel.expr(1) + :counter)
However, without support for UPDATE RETURNING or something similar, there is no way to atomically increment and return the incremented value in a single operation.
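To make the underlying technique concrete (this is plain JDBC against PostgreSQL rather than Sequel, and the counters table and its column names are assumptions for the sketch), the increment and the read happen in one statement, so concurrent callers always get distinct values:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class AtomicCounter {
    // Increments the counter row atomically and returns the new value.
    public static long nextValue(Connection conn, long counterId) throws SQLException {
        String sql = "UPDATE counters SET counter = counter + 1 WHERE id = ? RETURNING counter";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, counterId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getLong(1);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Connection details are placeholders for this example.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password")) {
            System.out.println(nextValue(conn, 1));
        }
    }
}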
The answer is: in a multithreaded environment, don't use DB counters. When faced with this dilemma:
If I need a unique integer counter, use a threadsafe counter generator that parcels out counters as threads require them (see the sketch after this list). This can be a simple integer or something more complex like a Twitter Snowflake-like generator.
If I need a unique identifier, I use something like a uuid
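A minimal in-process sketch of such a counter generator (this alone does not cover workers on different hosts, which the update below addresses; there it would have to sit behind a single shared service):

import java.util.concurrent.atomic.AtomicLong;

// One authority hands out values; threads ask it whenever they need a counter.
public class CounterGenerator {
    private final AtomicLong next = new AtomicLong(1);

    public long nextCounter() {
        // getAndIncrement is atomic, so two threads can never receive the same value.
        return next.getAndIncrement();
    }
}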
In your particular situation, where you need a count of albums - is there a reason you need this on the database rather than as a derived field on the model?
Update 1:
Given that you're dealing with something approximating file exports with workers on multiple hosts, you either need to parcel out the ids in advance (i.e. seed a worker with a job and the next available id from a single canonical source) or have the workers call in to a central service which allocates transaction ids on a first come first served basis.
I can't think of another way to do it. I've never worked with a POS system, but the telecoms network provisioning systems I've worked on have generally used a single transaction generator service which namespaced ids as appropriate.
I have a Guava cache which I would like to expire after X minutes have passed from the last access on a key. However, I also periodically do an action on all the current key-vals (much more frequently than the X minutes), and I wouldn't like this to count as an access to the key-value pair, because then the keys will never expire.
Is there some way to read the value of the keys without this influencing the internal state of the cache? i.e. something like cache._secretvalues.get(key), where I could conceivably subclass Cache to StealthCache and do getStealth(key)? I know relying on internal stuff is non-ideal; I'm just wondering if it's possible at all. I think when I do cache.asMap().get() it still counts as an access internally.
From the official Guava tutorials:
Access time is reset by all cache read and write operations (including
Cache.asMap().get(Object) and Cache.asMap().put(K, V)), but not by
containsKey(Object), nor by operations on the collection-views of
Cache.asMap(). So, for example, iterating through cache.entrySet()
does not reset access time for the entries you retrieve.
So, what I would have to do is iterate through the entrySet instead to do my stealth operations.
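A minimal sketch of that approach (the cache setup and the 10-minute window are assumptions for the example):

import java.util.Map;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class StealthIterationExample {
    public static void main(String[] args) {
        Cache<String, Integer> cache = CacheBuilder.newBuilder()
                .expireAfterAccess(10, TimeUnit.MINUTES)
                .build();

        cache.put("a", 1);
        cache.put("b", 2);

        // Iterating the entry set of the map view does NOT reset access time,
        // so the periodic sweep doesn't keep entries alive.
        for (Map.Entry<String, Integer> entry : cache.asMap().entrySet()) {
            process(entry.getKey(), entry.getValue());
        }

        // A normal read like cache.getIfPresent("a") WOULD reset access time.
    }

    private static void process(String key, Integer value) {
        System.out.println(key + " -> " + value);
    }
}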
Is it possible to configure or extend ehCache to satisfy the following requirements?
Cache an element with a time to live.
When an element is requested from the cache and its time to live has been exceeded, attempt to refresh that value; however, if the lookup fails, use the previous value.
The first of these is fairly obvious, but I don't see a way of satisfying the second condition.
Not without overriding the Cache.searchInStoreWith/WithoutStats methods. The thing is that it's currently implemented in such a way that the element is first evicted from the underlying Store, and only then, if one is configured, is the CacheListener called - and even then only the key is passed, not the value.
Of course, it would be possible to configure a CacheWriter and not evict expired elements (and even update them), but without overriding Cache.get your call would still return null. So it's kind of possible to hack around this and get hold of the expired element.
Though to me it seems that it would be easy to change the implementation so that an expired element wouldn't be evicted, but instead the CacheLoader could be invoked. I'm planning on doing just that asynchronously, since I'm better off with stale data than waiting too long for a response from the SOR, or having no data at all if the SOR isn't reachable.
It seems like something similar is implemented to comply with JSR 107, but it does not distinguish between non-existent and expired elements, and if the lookup fails the expired element is gone.
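As a rough illustration of the hack-around direction (this is not built-in EhCache behaviour; the side map of last-known values and the ValueLoader interface are made up for the sketch), one can keep a stale copy next to the cache and fall back to it when the refresh fails:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import net.sf.ehcache.Cache;
import net.sf.ehcache.Element;

public class StaleFallbackLookup {
    // Last successfully loaded value per key; survives eviction of expired elements.
    private final Map<Object, Object> lastKnown = new ConcurrentHashMap<Object, Object>();
    private final Cache cache;
    private final ValueLoader loader; // hypothetical loader interface for the SOR lookup

    public interface ValueLoader {
        Object load(Object key) throws Exception;
    }

    public StaleFallbackLookup(Cache cache, ValueLoader loader) {
        this.cache = cache;
        this.loader = loader;
    }

    public Object get(Object key) {
        Element element = cache.get(key);
        if (element != null) {
            return element.getObjectValue(); // still fresh
        }
        try {
            Object value = loader.load(key); // expired or missing: try to refresh
            cache.put(new Element(key, value));
            lastKnown.put(key, value);
            return value;
        } catch (Exception lookupFailed) {
            // Refresh failed: fall back to the previous value if we ever had one.
            return lastKnown.get(key);
        }
    }
}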
I'm running APC mainly to cache objects and query data as user cache entries; each item is set up with a specific time relevant to how long it's required in the cache - some items are 48 hours, but more are 2-5 minutes.
It's my understanding that when the timeout is reached and the current time passes the created-at time, the item should be automatically removed from the user cache entries?
This doesn't seem to be happening, though, and the items are instead staying in memory. I thought maybe the garbage collector would remove these items, but it doesn't seem to have done so, even though it's running once an hour at the moment.
The only other thing I can think of is that the default apc.user_ttl = 0 overrides the individual timeout values and sets them to never be removed, even after their individual timeouts?
In general, a cache manager SHOULD keep your entries for as long as possible, and MAY delete them if/when necessary.
The Time-To-Live (TTL) mechanism exists to flag entries as "expired", but expired entries are not automatically deleted, nor should they be, because APC is configured with a fixed memory size (using the apc.shm_size configuration item) and there is no advantage in deleting an entry when you don't have to. There is a blurb about this in the APC documentation:
If APC is working, the Cache full count number (on the left) will
display the number of times the cache has reached maximum capacity and
has had to forcefully clean any entries that haven't been accessed in
the last apc.ttl seconds.
I take this to mean that if the cache never "reached maximum capacity", no garbage collection will take place at all, and it is the right thing to do.
More specifically, I'm assuming you are using the apc_add/apc_store functions to add your entries; this has a similar effect to apc.user_ttl, which the documentation explains as:
The number of seconds a cache entry is allowed to idle in a slot in
case this cache entry slot is needed by another entry
Note the "in case" statement. Again I take this to mean that the cache manager does not guarantee a precise time to delete your entry, but instead try to guarantee that your entries stays valid before it is expired. In other words, the cache manager puts more effort on KEEPING the entries instead of DELETING them.
apc.ttl doesn't do anything unless there is insufficient allocated memory to store newly incoming variables; if there is sufficient memory, the cache will never expire. So you have to specify a TTL for every variable you store using apc_store() or apc_add() to force APC to regenerate it after the TTL passed to the function has elapsed. If you use opcode caching, it will also never expire unless the page is modified (when apc.stat=1) or there is no memory. So apc.user_ttl and apc.ttl actually have very little effect.