A way for fetcher.load to not participate in revalidation - remix.run

With the release of Remix.run v1.10, there's a note in the changelog:
fetcher.load calls now participate in revalidation - they always should have to avoid stale data on your page!
Is there a way to disable that?
I only want my fetcher to load once, when I manually trigger it.

Related

NextJS - ISR - How to Validate Forever Unless On-Demand Called

NextJS has a revalidation option.
revalidate: X
This is great for dynamically changing data for large sites with lots of visitors. But what if I don't change my data every minute, and don't want it to revalidate ever unless I manually call the on-demand revalidation through my app (like when a post gets updated).
I would like to cache my data unless the data gets manually updated in the database.
Can this be done, or is ISR not the best tool for the job?
Thanks,
J
If revalidate is omitted, Next.js will use the default value of false (no revalidation) and only revalidate the page on-demand when revalidate() is called.

Is there a Caffeine feature that will purge a particular item from the cache after defined time and recreates it at the same time?

ExpireAfter will only purge the item but will not re-create the item. So what I need to do is, after a predefined interval, I need to purge a particular item from the cache and at the same time I need to recreate it. It might recreate with same data if there is no change in the data. Assuming the data was changed, the recreating will give the latest object.
My idea was to retrieve latest item form the cache all the time. In contrast, the Refresh feature (https://github.com/ben-manes/caffeine/wiki/Refresh) will provide the stale item for the first request and does an asynchronous loading. So for the second request the cache will provide the latest object.
Asynchronous removal listener that re-fetches the expired entry
should work in my case. Can you please provide me some information on
how to achieve this?
I'm also curious to know how the scheduled task can do it?
Assuming cache can address the following two cases:
Subsequent requests case:
I understand the refreshAfterWrite will provide the stale entry for
the first time but for the second request, what happens if the cache
hasn't yet completed loading the expired entry?
Does cache blocks the second request, completes the re-fetch, and
then provide the latest value to the second request?.
The idea is to make the cache provides the latest data after the
defined entry expiry time.
In the case where the cache has to load values equal to its capacity at one shot:
Let say the cache size is 100 and the time to load all the 100 items
is 2 minutes.
Assuming the first request would load 100 items into the cache at the
same time, after the defined expiry time, the cache should evict and
re-fetch all the 100 elements.
For the second request to access items from those 100 items, how can
I make the cache smart enough so that it returns the entries that
have been re-loaded and asynchronously re-loads the other entries?.
The idea is not to block any request for an existing entry. Serve the
request for an existing entry and do the re-load for the remaining
expired entries.
Asynchronous removal listener that re-fetches the expired entry should work in my case. Can you please provide me some information on how to achieve this?
The removal listener requires a reference to the cache, but that is not available during construction. If it calls a private method instead then the uninitialized field isn't captured and it can be resolved at runtime.
Cache<K, V> cache = Caffeine.newBuilder()
.expireAfterWrite(1, TimeUnit.HOURS)
.removalListener((K key, V value, RemovalCause cause) -> {
if (cause == RemovalCause.EXPIRED) {
reload(key);
}
}).build();
private void reload(K key) {
cache.get(key, k -> /* load */);
}
I'm also curious to know how the scheduled task can do it?
If you are reloading all entries then you might not even need a key-value cache. In that case the simplest approach would be to reload an immutable map.
volatile ImmutableMap<K, V> data = load();
scheduledExecutorService.scheduleAtFixedRate(() -> data = load(),
/* initial */ 1, /* period */ 1, TimeUnit.HOURS);
I understand the refreshAfterWrite will provide the stale entry for the first time but for the second request, what happens if the cache hasn't yet completed loading the expired entry?
The subsequent requests obtain the stale entry until either (a) the refresh completes and updates the mappings or (b) the entry was removed and the caller must reload. The case of (b) can occur if the entry expired while the refresh was in progress, where returning the stale value is no longer an option.
Does cache blocks the second request, completes the re-fetch, and then provide the latest value to the second request?.
No, the stale but valid value is returned. This is to let the refresh hide the latency of reloading a popular entry. For example an application configuration that is used by all requests would block when expired, causing periodic delays. The refresh would be triggered early, reload, and the callers would never observe it absent. This hides latencies, while also allowing idle entries to expire and fade away.
In the case where the cache has to load values equal to its capacity at one shot... after the defined expiry time, the cache should evict and re-fetch all the 100 elements.
The unclear part of your description is if the cache reloads only the entries being accessed within the refresh period or if it reloads the entire contents. The former is what Caffeine offers, while the latter is better served with an explicit scheduling thread.

Ensure consistency when caching data in after_commit hooks

For a specific database table, we need an in-memory cache of this data that is always in-sync with the database. My current attempt is to write the changes to the cache in an after_commit hook - this way we make sure not to write any changes to the cache that could get reverted later.
However, this strategy is vulnerable to the following scenario:
Thread A locks and updates record, stores value 1
Thread A commits the change
Thread B locks and updates record, stores value 2
Thread B commits the change
Thread B runs the after_commit hook, so the cache now has value 2
Thread A runs the after_commit hook, so the cache now has value 1 but should have value 2
Am I right about this problem and how would one solve this?
You are right about this problem.
There is a after_save callback that runs within the same transaction. You might want to use that one instead of the after_commit hook that run after the transaction.
But than you will need to deal with a rolled back transaction yourself.
Or you might want to write your caching method in a way that does not depend on a specific instance. But instead caches the latest version that is found in the database by reloading the record from the database first.
But even than: Multithreaded systems are hard to keep in sync. And you cannot even ensure if the first or the second update send to your cache would be stored, because the caching system might be multi-threaded too.
You might want to read about different consistency models.
The solution we came up with is to lock the cache for read / write before_commit and unlock it in the after_commit. This seems to do the trick.

Is there a way to keep ajax calls from firing off seemingly sequentially in web2py?

I'm developing an SPA and find myself needing to fire off several (5-10+) ajax calls when loading some sections. With web2py, it seems that many of them are waiting until others are done or near done to get any data returned.
Here's an example of some of Chrome's timeline output
Where green signifies time spent waiting, gray signifies time stalled, transparent signifies time queued, and blue signifies actually receiving the content.
These are all requests that go through web2py controllers, and most just do a simple operation (usually a database query). Anything that accesses a static resource seems to have no trouble being processed quickly.
For the record, I'm using sessions in cookies, since I did read about how file-based sessions force web2py into similar behavior. I'm also calling session.forget() at the top of any controller that doesn't modify the session.
I know that I can and I intend to optimize this by reducing the number of ajax calls, but I find this behavior strange and undesirable regardless. Is there anything else that can be done to improve the situation?
If you are using cookie based sessions, then requests are not serialized. However, note that browsers limit the number of concurrent connections to the same host. Looking at the timeline output, it does look like groups of requests are indeed made concurrently, but Chrome will not make all 21 requests concurrently.
If you can't reduce the number of requests but must make them all concurrently, you could look into domain sharding or configuring your web server to use HTTP/2.
As an aside, in web2py, if you are using file based sessions and want to unlock the session file within a given request in order to prevent serialization of requests, you must use session.forget(response) rather than just session.forget() (the latter prevents the session from being saved even if it has been changed, but it does not immediately unlock the file). In any case, there is no session file to unlock if you are using cookie based sessions.

What is it called when two requests are being served from the same cache?

I'm trying to find the technical term for the following (and potential solutions), in a distributed system with a shared cache:
request A comes in, cache miss, so we begin to generate the response
for A
request B comes in with the same cache key, since A is not
completed yet and hasn't written the result to cache, B is also a
cache miss and begins to generate a response as well
request A completes and stores value in cache
request B completes and stores value in cache (over-writing request A's cache value)
You can see how this can be a problem at scale, if instead of two requests, you have many that all get a cache miss and attempt to generate a cache value as soon as the cache entry expires. Ideally, there would be a way for request B to know that request A is generating a value for the cache, and wait until that is complete and use that value.
I'd like to know the technical term for this phenomenon, it's a cache race of sorts.
It's a kind of Thundering Herd
Solution: when first request A comes and fills a flag, if request B comes and finds the flag then wait... After A loaded the data into the cache, remove flag.
If all other request are waked up by the cache loaded event, would trigger all thread "Thundering Herd". So also need to care about the solution.
For example in Linux kernel, only one process would be waked up, even several process depends on the event.

Resources