Which does stale-while-revalidate cache strategy mean? - caching

I am trying to implement different cache strategies using ServiceWorker. For the following strategies the way to implement is completely clear:
Cache first
Cache only
Network first
Network only
For example, while trying to implement the cache-first strategy, in the fetch hook of the service-worker I will first ask the CacheStorage (or any other) for the requested URL and then if exists respondWith it and if not respondWith the result of network request.
But for the stale-while-revalidate strategy according to this definition of the workbox, I have the following questions:
First about the mechanism itself. Does stale-while-revalidate mean that use cache until the network responses and then use the network data or just use the network response to renew your cache data for the next time?
Now if the network is cached for the next time, then what scenarios contain a real use-case of that?
And if the network response should be replaced immediately in the app, so how could it be done in a service worker? Because the hook will be resolved with the cached data and then network data could not be resolved (with respondWith).

Yes, it means exactly that. The idea is simple: respond immediately from the cache, then refresh the cache in the background for the next time.
All scenarios where it is not important to always get the very latest version of the page/app =) I'm using stale-while-revalidate strategy on two different web applications, one for public transportation services and one for displaying restaurant menu information. Many sites/apps are just fine with this but of course not all.
One very important thing to note here on the #2:
You could eg. use stale-while-revalidate only for static assets. This way your html, js, css, images etc. would be cached and quickly served to the user, but the data fetched dynamically from an API could still be fresh. For some apps this works, for some others not so well. Depends completely on the app. Of course you have to remember not to change the semantics of your API if the user is running a previous version of the app etc.
Not possible in any automatic way. What you could do, however, is implement a msg channel between the Service Worker and the "regular JS code on the page" using window.postMessage API. You could listen for certain messages on the page and then, from the Service Worker, send a msg when an important change has happened and the cache has been updated. Then you could either show the user a prompt telling that the page really needs to be reloaded right now or even force reload it from JS. You would need to put this logic of determining when an important update has happened into the Service Worker of course.

Related

How to implement synchronization of browser-based online games when users refresh their browser

In implementing a browser-based simple game involving multiple users, I have the server save the game state at certain sync points (not time-based but event-specific). I identify each state by an integer.
When a user refreshes his browser, the server provides the latest state and restores the content in the browser. However, in those few seconds while the browser is loading the latest content after browser-refresh, the state could change again. I do not know how to handle this situation because sending the next state will again raise the same issue.
I want a seamless refresh so none of the other players are impacted when one user refreshes his browser (or for that matter leaves and comes back).
The implementation language is not relevant. I use websockets to communicate between the browser and the server. The server is the intermediary for all communication between users (I am not using WebRTC data channels). What is the best way to sync the application content in multiple browsers?
This is indeed a programming-based question though no code is provided.
Forget the fact that your client exists in a browser. Let's just talk about replication.
The usual approach in databases is to separate snapshots from Write Access Logging (WAL) logs. When you bring a new client up, you select a snapshot and transfer that. Then when the client is ready it asks for WAL logs from that snapshot forward. The same mechanism is used after crashes. The last available snapshot is loaded, then the WAL log is replayed, then the database comes up.
I would suggest the same strategy. This does require efficient storage of snapshots. Some kind of log. And some kind of replay mechanism. Which is a lot of easy to mess up code. If you can use something existing, that would be good.
The first thing that I looked into was using Emscripten to compile Redis to JS, and then try to use Redis' built-in asynchronous replication to replicate to your browser. That may be possible, but the fact that Redis is single-threaded and wants to be a client-server is probably a showstopper.
The next best option that I found is that you can use https://isomorphic-git.org/. Here is how that could build what you need. You simply maintain your current state in a git repository, and keep a WAL log of everything that you've done with it. When a client connects, it clones the repository. Once done it connects to the websocket, tells you what commit it is at, and you send it the WAL log from that point forward. Locally in the browser you run those git commands. If the client simply loses its connection and then rejoins, it can do a git pull, and then follow the same strategy.
This will be a bunch of work for you. But a lot less work than implementing everything from scratch.

From a purely caching point of view, is there any advantage using the new Cache API instead of regular http cache?

The arrival of service workers has led to a great number of improvements to the web. There are many use cases for service workers.
However, from a purely caching point of view, does it makes sense to use the Cache API?
Many approaches make assumptions of how resources will be handled.
Often only the URL is used to determine how the resource should be handled with strategies such as Network first, Network Only, Stale-while-revalidate, Cache first and Cache only. This can be tedious work, because you have to define a specific handler for many URLs. It's not scalable.
Instead I was thinking of using regular HTTP cache in combination with the Cache API. Response headers contain useful information that can be used to cache and verify if the cache can still be used or if a new version would be available. Together with best practice caching (for example https://jakearchibald.com/2016/caching-best-practices/), this could create a generic service worker that has not te be updated when resources change.
Based on the response headers, a resource could be handled by a custom handler. If the headers would ever be updated, it would be possible to handle the resource with a different handler if necessary.
But then I realised, I was just reimplementing browser cache with the Cache API. This would mean that the resources would be cached double (take this with a grain of salt), by storing it in both the browser and the service worker cache. Additionally, while the Cache API provides more control, most handlers can be (sort of) simulated with http cache:
Network only: Cache-Control: no-store
Cache only: Cache-Control: immutable
Cache first: Cache-Control: max-age with validation (Etag, Last Modified, ...)
Stale-while-revalidate: Cache-Control: stale-while-revalidate
I don't immediately see how to simulate network first, but then again this would imply support for offline usage (or bad connection). (Keep in mind, this is not the use case I'm looking for).
While it's always useful to provide a fallback (using service workers & Cache API), is it worth having the resources possibly cached double and having copied the browser's caching logic? I'm aware that the Cache API can be used to precache resources, but I think these could also be precached by requesting them in advance.
Lastly, I know the browser is in charge of managing the browser cache and a developer has limited control over it (using HTTP Cache headers).
But the browser could also choose to remove the whole service worker cache to clear disk space. There are ways to make sure the cache persists, but that's not the point here.
My questions are:
What advantages has the Cache API that can't be simulated with regular browser cache?
What could be cached with the Cache API, but not with regular browser cache?
Is there another way to create a service worker that does not need to be updated
What advantages has the Cache API that can't be simulated with regular browser cache?
CacheAPI have been created to be manipulated by Service Worker, so you can do nearly what you want with it, be you can't interfere or do anything to HTTP cache, it's all in browser mechanic, i'm not sure but HTTP cache is completly wreck when your offline, not CacheAPI.
What could be cached with the Cache API, but not with regular browser cache?
Before caching request, you can alter request to fit your need, or even cache response with Cache-Control: 0 if you want. Even store custom data that will need after.
Is there another way to create a service worker that does not need to be updated
It need a bit of work, be two of solution to achieve that is :
On each page call you communicate with SW using postMessage to compare version (it can be an id, a hash or even whole list of assets), it's different, you can load list of ressource from given location then add it to cache. (Due to javascript use, this won't work if you have to make it work from AMP)
Each time a user load a page, or each 10/20min ( or both, whatever you want ), you call a files to know your assert version, if it's different, you do the same thing on the other solution.
Hope I help

Best way to initialize initial connection with a server for REST calls?

I've been building some apps that connect to a SQL backend. I use ajax calls to hit WebMethods, a WebAPI, etc.
I notice that the first initial call to the SQL backend retrieves the data fairly slow. I can only assume that this is because it must first negotiate credentials first before retrieving the data. It probably caches this somewhere, and thus, any calls made afterwards come back very fast.
I'm wondering if there's an ideal, or optimal way, to initialize this connection.
My thought was to make a simple GET call right when the page loads (grabbing something very small, like a single entry). I probably wouldn't be using the returned data in any useful way, other than to ensure that any calls afterwards come back faster.
Is this an okay way to approach fixing the initial delay? I'd love to hear how others handle this.
Cheers!
There are a number of reasons that your first call could be slower than subsequent ones
Depending on your server platform, code may be compiled when first executed
You may not have an active DB connection in your connection pool
The database may not have cached indices or data on the first call
Some VM platforms may take a while to allocate sufficient resources to your server if it has been idle for a while.
One way I deal with those types of issues on the server side is to add startup code to my web service that fetches data likely to be used by many callers when the service first initializes (e.g. lookup tables, user credential tables, etc).
If you only control the client, consider that you may well wish to monitor server health (I use the open source monitoring platform Zabbix. There are also many commercial web-based monitoring solutions). Exercising the server outside of end-user code is probably better than making an extra GET call from a page that an end user has loaded.

How can I cache the data in Meteor?

thanks everyone!
recently i want to built a small cms on meteor,but have some question
1,cache,page cache,data cache,etc..
For example,when people search some article
in server side:
Meteor.publist('articles',function(keyword){
return Articles.find({keyword:keyword});
});
in client:
Meteor.subscribe('articles',keyword);
that's ok ,but ......
the question is ,everytime people doing so ,it invoke a mongo query,and reduce the performance,
in other framework use common http or https,people can depend on something like squid or varnish to cache the page or data,so everytime you route to a url,you read data from the cache server ,but Meteor built on socket.js or websocket,and I don't know how to cache throught the socket.......I trid varnish ,but seen no effect.
so,may be it ignore the websocket?is there some method to cache the data,in the mongodb,in server,can i add some cache server ?
2, chat
I see the chatroom example in https://github.com/zquestz/simplechat
But unlike implyment using socket.js,this example save the chat message in the mongodb ,so the data flow is message ->mongo->query->people,this invoke the mongo query too!
and in socket.js,just save the socket in the context(or the server side cache),so the data don't go throught the db.
My question is , is there a socket interface in Meteor ,so I can message->socket->people? and if can't , how is the performace in the productive envirment as the chatroom example doing(i see it runs slow ...)
With Meteor, you don't have to worry about caching Mongodb queries. Meteor does that for you. Per the docs on data and security:
Every Meteor client includes an in-memory database cache. To manage the client cache, the server publishes sets of JSON documents, and the client subscribes to those sets. As documents in a set change, the server patches each client's cache.
[...]
Once subscribed, the client uses its cache as a fast local database, dramatically simplifying client code. Reads never require a costly round trip to the server. And they're limited to the contents of the cache: a query for every document in a collection on a client will only return documents the server is publishing to that client.
Because Meteor does poll the server every so often to see if the client's cache needs patching, you're probably seeing those polls happening every now and then. But they probably aren't very large requests. Additionally, due to a feature of Meteor called latency compensation, when you update a data source, the client immediately reflects the change without first waiting on the server. This reduces the appearance of performance reduction to the user.
If you have many documents in mongo, you may also be seeing them all get fetched if you still have the autopublish package enabled. You can fix that by removing it with meteor remove autopublish and write code to only publish the relevant data instead of the entire database.
If you really need to manage caching manually, the docs also go into that:
Sophisticated clients can turn subscriptions on and off to control how much data is kept in the cache and manage network traffic. When a subscription is turned off, all its documents are removed from the cache unless the same document is also provided by another active subscription.
Additional performance improvements to Meteor are currently being worked on, including a DDP-level proxy to support "very large number of clients". You can see more detail on this at the Meteor roadmap.
If you stumble upon this question not because of a lack of understanding of meteor's minimongo and are instead interested in how to cache subscriptions after they are no longer needed for the moment (but they maybe in the future and don't want to keep their extra DDP overhead on client server) there are two package options:
https://github.com/ccorcos/meteor-subs-cache
https://github.com/kadirahq/subs-manager
I was creating a mobile app and cache of database was not working hence I used GroundDB package of meteor https://github.com/raix/Meteor-GroundDB now the database is always in local whenever I restart the app,
Also you need to look in appcache package of meteor to cache the entire app locally.

Recommend cache updating strategy

Our site is divided into several smaller sites recently, which are then distributed in different IDCs.
One of these sites serves user authentication and other user-related services, the other sites access it through web services.
On every site that fetches data remotely, we make a local cache so that we don't have to go remote every time user information is needed.
What cache updating strategy would you recommend to ensure data integrity?
Since you need the updated-policy close to realtime, you definitely need the cache-invalidation notification engine.
There are 2 possible implementation models for it:
1.Pull
Main server pulls child-servers with notification messages like "resourceID=34392 not more valid in your cache".
This message should be sent on each data update on main server.
Poll
Each child-server ask main server about the cache item validity right before serving it to user.
Ofcourse, in this case, main server should keep the list of objects updated during last cache-lifetime period, and respond to "If-object-was-updated" requests very quickly.
As you see in both cases, your main server should trigger an event on each data change.
In first case this event will be transferred via 'notification bus' to child server, and in second case this event will be stored in recently-updated-objects list.
So both options need some code changes on main server.
As for me the second options is much more easy to implement in common, but it`s very depends of the software stack you're using.

Resources