NextJS - ISR - How to Cache Forever Unless On-Demand Revalidation Is Called - caching

NextJS has a revalidation option.
revalidate: X
This is great for dynamically changing data on large sites with lots of visitors. But what if I don't change my data every minute and don't want it to ever revalidate unless I manually trigger on-demand revalidation from my app (like when a post gets updated)?
I would like to cache my data unless the data gets manually updated in the database.
Can this be done, or is ISR not the best tool for the job?
Thanks,
J

If revalidate is omitted, Next.js uses the default value of false (no revalidation) and will only regenerate the page on demand, i.e. when you trigger on-demand revalidation yourself (for example via res.revalidate()).
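For reference, a minimal sketch of that setup with the Pages Router (assuming a recent Next.js version where res.revalidate() is available; getPost, the secret name and the paths are placeholders, not something from the question):

// pages/posts/[slug].js
export async function getStaticProps({ params }) {
  const post = await getPost(params.slug); // hypothetical data fetch
  return {
    props: { post },
    // revalidate omitted (defaults to false): the page is cached indefinitely
    // and only regenerated when on-demand revalidation is triggered below.
  };
}

// pages/api/revalidate.js - called by your CMS/webhook when a post is updated
export default async function handler(req, res) {
  if (req.query.secret !== process.env.REVALIDATION_SECRET) {
    return res.status(401).json({ message: 'Invalid token' });
  }
  await res.revalidate(`/posts/${req.query.slug}`); // regenerate just this page
  return res.json({ revalidated: true });
}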

Related

State vs cookie/localstorage read performance

I am developing an app in React + Redux and I have a constant doubt I can't find documentation about: is there any performance downside if, let's say in a saga, I read data from a cookie/localStorage instead of from the state? This read would only happen once on each load.
The key thing here is the performance, without taking into consideration whether it's good or bad practice.
Thanks in advance.
First of all, what do you mean by state? In Redux, state is just a plain object (the store adds some methods, but still), so when you read data from there you are simply reading properties off an object.
Cookies and localStorage, on the other hand, go through the DOM API, which is slower to begin with, and you don't just read the data, you also have to parse it (both cookies and storage hold serialized data). So storage/cookies are definitely slower than state.
You can check http://jsben.ch/nvo5G
BUT! You can't keep an in-memory object's state between page reloads, so for that you can use storage (a pattern named persistent state). There is really no other way to implement this functionality: if you need to restore some state on reload, you have just two options, save the state on the client (cookies, storage, or a client-side DB) or on the server (and do a fetch request).
These are MICRO optimisations; mostly you shouldn't care about them (in the case of reading just once on app start).
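A rough illustration of the difference (the store shape, rootReducer and key names here are made up):

// Reading from Redux state: plain property access on an in-memory object.
const theme = store.getState().user.theme;

// Reading from localStorage: a DOM API call plus deserialization on every read.
const raw = localStorage.getItem('user');
const themeFromStorage = raw ? JSON.parse(raw).theme : null;

// Persistent-state pattern: touch storage once on startup, then work only with state.
const preloadedState = raw ? { user: JSON.parse(raw) } : undefined;
const storeWithPersistence = createStore(rootReducer, preloadedState);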

Is there a way to keep ajax calls from firing off seemingly sequentially in web2py?

I'm developing an SPA and find myself needing to fire off several (5-10+) ajax calls when loading some sections. With web2py, it seems that many of them are waiting until others are done or near done to get any data returned.
Here's an example of some of Chrome's timeline output, where green signifies time spent waiting, gray signifies time stalled, transparent signifies time queued, and blue signifies actually receiving the content.
These are all requests that go through web2py controllers, and most just do a simple operation (usually a database query). Anything that accesses a static resource seems to have no trouble being processed quickly.
For the record, I'm using sessions in cookies, since I did read about how file-based sessions force web2py into similar behavior. I'm also calling session.forget() at the top of any controller that doesn't modify the session.
I know that I can and I intend to optimize this by reducing the number of ajax calls, but I find this behavior strange and undesirable regardless. Is there anything else that can be done to improve the situation?
If you are using cookie based sessions, then requests are not serialized. However, note that browsers limit the number of concurrent connections to the same host. Looking at the timeline output, it does look like groups of requests are indeed made concurrently, but Chrome will not make all 21 requests concurrently.
If you can't reduce the number of requests but must make them all concurrently, you could look into domain sharding or configuring your web server to use HTTP/2.
As an aside, in web2py, if you are using file based sessions and want to unlock the session file within a given request in order to prevent serialization of requests, you must use session.forget(response) rather than just session.forget() (the latter prevents the session from being saved even if it has been changed, but it does not immediately unlock the file). In any case, there is no session file to unlock if you are using cookie based sessions.

Varnish: purge cache every time user hits "like" button

I need to implement like/dislike functionality (for anonymous users, so there is no need to sign up). The problem is that the content is served by Varnish and I need to display the actual number of likes.
I'm wondering how this is done on websites like Stack Overflow. Assuming pages are cached in Varnish (for anonymous users only), every time a user votes on an answer/question the page needs to be purged from cache, right? The current number of votes needs to be visible to other users.
What is a good approach in this situation? Should I send a PURGE to Varnish every time a user hits the "like" button?
A common way of implementing this is to handle the like button and the count display client side in JavaScript instead. This sidesteps the issue somewhat.
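A sketch of that client-side approach, assuming a small endpoint that is excluded from caching in your VCL (the /votes/123 URL and the element IDs are made up):

// The cached page ships without the count; the browser fills it in and posts
// votes to the uncached endpoint, so the Varnish-cached HTML never changes.
fetch('/votes/123')
  .then(r => r.json())
  .then(data => { document.querySelector('#likes-123').textContent = data.likes; });

document.querySelector('#like-123').addEventListener('click', () => {
  fetch('/votes/123', { method: 'POST' })
    .then(r => r.json())
    .then(data => { document.querySelector('#likes-123').textContent = data.likes; });
});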
Assuming that pressing Like leads to a POST request hitting a single Varnish server, you can have the object invalidated/replaced in different ways. Using purge and a VCL restart is most likely the better way to do this.
Of course there is a slight race here, where other clients will be served the old page while this is ongoing.

Why does request latency increase when adding more async requests, resulting in ridiculous load times?

I have an MVC app that is basically one main view and multiple partial views.
I have this tiny script on the main page that loads the partial views asynchronously:
<script>
    $(function () {
        // Load each partial view asynchronously into its container
        $('#bookmarks').load('Home/Bookmarks');
        $('#calendar').load('Home/Calendar');
        $('#gmail').load('Home/Gmail');
        $('#weather').load('Home/Weather');
        $("#news").load("Home/News");
    });
</script>
When I comment them all out, I get this (very fast loading):
Now I uncomment just the "Bookmarks" request (it just reads a small JSON file from the local drive), I get this (Bookmarks takes 9ms):
Now I uncomment the "Calendar" request (Google Calendar API), I get this (why does the Bookmarks latency jump from 9ms to 1.04s if the requests are async?):
Now I uncomment the "Gmail" request (Google Gmail API), I get this (Bookmarks latency jumps again, from 1.04s to 1.53s?):
Now I uncomment the rest (the "Weather" and "News" requests), I get these insanely increased latencies all over (the Bookmarks request now takes 5s to execute instead of 9ms - why?):
You can see the increase in latency for each operation - it looks like these ajax requests are not asynchronous at all :( How is that possible, when AJAX is supposed to be async by default?
I am sure I am missing something here; maybe the jQuery load function is not async, but that's on the JavaScript side, and the delay is on the server side. I am now confused.
Update: obviously the jQuery calls are async; all load functions are executed at the same time. That part is easy. The problem is on the server side. After having done some tests, it's clear that IIS executes these requests synchronously, sequentially one after another in the order it received them from the browser. I have done some additional reading on IIS: by default an IIS app pool uses one worker process, which can spin up multiple threads to process all these requests. But for some reason the requests are being processed sequentially and not in parallel. I haven't found out yet why the requests are still executed sequentially (if an app pool worker process can start many threads for simultaneous processing), how to make these requests be processed in parallel, or whether that's even possible. If someone has any idea how to make things work properly, I would really like to hear it. Thanks.
Update: After some more reading, I found that requests are processed sequentially if Session is enabled. The session object is a single-threaded object; it cannot be shared by two threads simultaneously. Hence when there are two requests for the same session, one is queued while the session object is in use by the other. This sucks :( Any suggestions, IIS experts? :)
Solved
Yes, Session State was the problem! I disabled session state in web.config and removed the one line that used the Session object to avoid a runtime error. Now everything works perfectly. The true problem was indeed that Session state is bound to a single thread, so on the server side my app behaved like an old STA fart :)
http://msdn.microsoft.com/en-us/library/ms178581.aspx
Concurrent Requests and Session State
Access to ASP.NET session state is exclusive per session, which means that if two different users make concurrent requests, access to each separate session is granted concurrently. However, if two concurrent requests are made for the same session (by using the same SessionID value), the first request gets exclusive access to the session information. The second request executes only after the first request is finished. (The second session can also get access if the exclusive lock on the information is freed because the first request exceeds the lock time-out.) If the EnableSessionState value in the @ Page directive is set to ReadOnly, a request for the read-only session information does not result in an exclusive lock on the session data. However, read-only requests for session data might still have to wait for a lock set by a read-write request for session data to clear.
Don't use Session. I rarely find cases that justify use of Session, and it's often easy to find storage alternatives that don't have the scalability limitations that you run into with it.

Dealing with concurrency issues when caching for high-traffic sites

I was asked this question in an interview:
For a high traffic website, there is a method (say getItems()) that gets called frequently. To prevent going to the DB each time, the result is cached. However, thousands of users may be trying to access the cache at the same time, and so locking the resource would not be a good idea, because if the cache has expired, the call is made to the DB, and all the users would have to wait for the DB to respond. What would be a good strategy to deal with this situation so that users don't have to wait?
I figure this is a pretty common scenario for most high-traffic sites these days, but I don't have the experience dealing with these problems--I have experience working with millions of records, but not millions of users.
How can I go about learning the basics used by high-traffic sites so that I can be more confident in future interviews? Normally I would start a side project to learn some new technology, but it's not possible to build out a high-traffic site on the side :)
The problem you were asked about in the interview is the so-called cache miss-storm - a scenario in which a lot of users trigger regeneration of the cache at once, hitting the DB in the process.
To prevent this, first you have to set a soft and a hard expiration date. Let's say the hard expiration date is 1 day, and the soft one 1 hour. The hard one is actually set on the cache server; the soft one is stored in the cache value itself (or in another key on the cache server). The application reads from the cache, sees that the soft time has expired, sets the soft time 1 hour ahead and hits the database. This way the next request will see the already updated time and won't trigger the cache update - it will possibly read stale data, but the data itself will already be in the process of regeneration.
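A rough sketch of that soft/hard expiry idea in JavaScript (the cache client, its get/set signature, and db.fetchItems are assumptions, not a specific library):

// Hard TTL (1 day) is what the cache server enforces; the soft TTL (1 hour)
// travels inside the cached value itself.
const HARD_TTL_SECONDS = 86400;
const SOFT_TTL_MS = 3600 * 1000;

async function getItems(cache, db) {
  const entry = await cache.get('items'); // { data, softExpiresAt } or null
  if (entry && Date.now() < entry.softExpiresAt) {
    return entry.data; // still fresh
  }
  if (entry) {
    // Push the soft expiry forward right away, so concurrent requests keep
    // serving the stale copy while this request regenerates it.
    await cache.set('items', { ...entry, softExpiresAt: Date.now() + SOFT_TTL_MS }, HARD_TTL_SECONDS);
  }
  const fresh = await db.fetchItems();
  await cache.set('items', { data: fresh, softExpiresAt: Date.now() + SOFT_TTL_MS }, HARD_TTL_SECONDS);
  return fresh;
}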
The next point is: you should have a procedure for cache warm-up, e.g. instead of a user triggering the cache update, a process in your application pre-populates the new data.
The worst-case scenario is e.g. restarting the cache server, when you don't have any data at all. In this case you should fill the cache as fast as possible, and that's where a warm-up procedure may play a vital role. Even if you don't have a value in the cache, it would be a good strategy to "lock" the cache (mark it as being updated), allow only one query to the database, and handle it in the application by requesting the resource again after a given timeout.
You would probably be better off using some distributed cache repository, such as memcached or others, depending on your access pattern.
You could use the Cache implementation of Google's Guava library if you want to store the values inside the application.
From the coding point of view, you would need something like
public V get(K key) {
    V value = map.get(key);
    if (value == null) {
        synchronized (mutex) {
            // Re-check inside the lock: another thread may have populated the key already.
            value = map.get(key);
            if (value == null) {
                value = db.fetch(key);
                map.put(key, value);
            }
        }
    }
    return value;
}
where the map is a ConcurrentMap and the mutex is just
private static Object mutex = new Object();
In this way, you will have just one request to the db per missing key.
Hope it helps! (And don't store nulls; you could create a tombstone value instead!)
Cache miss-storm, or the cache stampede effect, is the burst of requests to the backend when the cache invalidates.
All highly concurrent websites I've dealt with used some kind of caching front-end. Be it Varnish or Nginx, they all have micro-caching and stampede effect suppression.
Just google for Nginx micro-caching, or Varnish stampede effect, and you'll find plenty of real-world examples and solutions for this sort of problem.
It all boils down to whether or not you'll allow requests to pass through the cache and reach the backend while the cached entry is in an updating or expired state.
Usually it's possible to actively refresh the cache, holding all requests to the entry being updated, and then serve them from cache.
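One way to hold concurrent requests on the entry being updated, sketched in JavaScript (the cache and loader shapes are made up): all misses for the same key share a single in-flight refresh.

const cache = new Map();     // key -> value
const inFlight = new Map();  // key -> promise of the refresh in progress

function getCached(key, loadFn) {
  if (cache.has(key)) return Promise.resolve(cache.get(key));
  if (!inFlight.has(key)) {
    // Only the first miss triggers the backend call...
    inFlight.set(key, loadFn(key).then(value => {
      cache.set(key, value);
      inFlight.delete(key);
      return value;
    }));
  }
  // ...every other concurrent miss waits on the same promise.
  return inFlight.get(key);
}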
But there is ALWAYS the question "What kind of data are you supposed to be caching, or not?", because, you see, if it is just a plain-text article that gets an edit/update, delaying the cache update is not as problematic as when your data must be shown accurately on thousands of displays (real-time gaming, financial services, and so on).
So the correct answer is: micro-caching, suppression of the stampede effect/cache miss-storm, and of course knowing which data to cache when, how and why.
It is only worth considering a particular data type for caching if its consumers are ready to accept stale data (within reasonable bounds).
In that case you can define an invalidation/eviction/update policy to keep your data up to date (in the business sense).
On update you just replace the data item in the cache, and all new requests will be answered with the new data.
Example: a stock info system. If you do not need real-time price info, it is reasonable to keep stock data in the cache and update it every X milliseconds/seconds with the expensive remote call.
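For instance, a sketch of that periodic-refresh policy (fetchQuoteFromRemote and the symbol are made up):

// Refresh the cached quote in the background every 5 seconds; readers never
// trigger the expensive remote call and never see an expired entry.
const quoteCache = new Map();

setInterval(async () => {
  const quote = await fetchQuoteFromRemote('ACME'); // the expensive remote call
  quoteCache.set('ACME', quote);                    // replace the item in cache
}, 5000);

function getQuote(symbol) {
  return quoteCache.get(symbol); // always answered from cache
}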
Do you really need to expire the cache? Could you have an incremental update mechanism with which you keep refreshing the data periodically, so that you never have to expire it at all?
Secondly, if you want to prevent too many users from hitting the db in one go, you can have a locking mechanism in your stored proc (if your db supports it) that prevents too many people from hitting the db at the same time. Also, you can have a caching mechanism in your db so that if someone asks for the exact same data again, you can always return a cached value.
Some applications also use a third service layer between the application and the database to protect the database from this scenario. The service layer ensures that the cache miss storm never reaches the db.
The answer is to never expire the cache and have a background process update the cache periodically. This avoids the wait and the cache-miss storms, but then why use a cache in this scenario?
If your app will crash in a "cache miss" scenario, then you need to rethink your app and what is cache versus needed in-memory data. Personally, I would use an in-memory database that gets updated when data changes (or periodically), not a cache at all, and avoid the aforementioned scenario.

Resources