Stop Service Worker Fetch Event caching specific files - caching

I'm listening for fetch events in a Service Worker and would like to know when the response to a fetch is cached.
1) When the fetch event is triggered? Would preventDefault() stop it?
2) Do I need cache: 'no-store' in the init parameter of my fetch() call?
3) Does it happen at event.respondWith() time?
I have a data delivery strategy that tries to deliver data as fresh as possible and only falls back to the cache if the network is slow. Even then, the network fetch keeps running so that it can update the cache whenever it finishes.
Without something like cache: 'no-cache' in the init, aren't we at risk of fetch() just reading from a cache anyway?
I (wrongly?) assumed caching was opt-in and that I had to update the cache manually with put().

From https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-lifetimes:
The Cache instances are not part of the browser’s HTTP cache
Don't confuse the HTTP cache with the caches of the Cache API.
The HTTP cache sits between the service worker and the network, so if you don't want the browser to store the response in the HTTP cache, you have to pass the 'no-store' cache option when fetching the resource.
The Cache API caches, on the other hand, are entirely under your control: if you don't store anything in them, you don't get anything from them.
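The strategy described in the question (network first, fall back to the Cache API copy when the network is slow, and let the network request finish so it can update the cache) could be sketched roughly as follows. This is only an illustration under assumptions: the function name freshOrCached, the 2000 ms timeout, and the cache name 'data-v1' are made up, and the cache and fetch function are passed in as parameters so the logic can be exercised outside a service worker.

```javascript
// Hypothetical values, not taken from the question.
const TIMEOUT_MS = 2000;
const CACHE_NAME = 'data-v1';

// Resolves with the network response if it arrives within timeoutMs,
// otherwise with the Cache API copy (if there is one). In both cases the
// network request keeps running and its result is written to the cache.
async function freshOrCached(request, cache, fetchFn, timeoutMs) {
  // 'no-store' makes fetch() skip the browser's HTTP cache entirely.
  const network = fetchFn(request, { cache: 'no-store' }).then(async (response) => {
    if (response.ok) {
      // Cache API writes are explicit: nothing is stored without this put().
      await cache.put(request, response.clone());
    }
    return response;
  });

  // Both "network failed" and "timer fired" are represented by null.
  const settled = network.catch(() => null);
  const timer = new Promise((resolve) => setTimeout(resolve, timeoutMs, null));
  const winner = await Promise.race([settled, timer]);
  if (winner) return winner;

  const cached = await cache.match(request);
  // No cached copy: there is nothing better to do than wait for the network.
  return cached || network;
}

// Wiring, only executed inside a real service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return;
    event.respondWith(
      caches.open(CACHE_NAME).then((cache) =>
        freshOrCached(event.request, cache, fetch, TIMEOUT_MS))
    );
  });
}
```

In a real worker you would probably also hand the pending network promise to event.waitUntil(), so the browser does not stop the worker before the late response has been written to the cache.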

The spec says this should not be happening:
https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-lifetimes
5.2. Understanding Cache Lifetimes
The Cache instances are not part of the browser’s HTTP cache. The Cache objects are exactly what authors have to manage themselves. The Cache objects do not get updated unless authors explicitly request them to be. The Cache objects do not expire unless authors delete the entries. The Cache objects do not disappear just because the service worker script is updated. That is, caches are not updated automatically. Updates must be manually managed. This implies that authors should version their caches by name and make sure to use the caches only from the version of the service worker that can safely operate on.

Related

What does the stale-while-revalidate cache strategy mean?

I am trying to implement different cache strategies using a ServiceWorker. For the following strategies, the implementation is completely clear:
Cache first
Cache only
Network first
Network only
For example, to implement the cache-first strategy, in the fetch hook of the service worker I first ask the CacheStorage (or any other store) for the requested URL. If it exists, I respondWith it; if not, I respondWith the result of the network request.
But for the stale-while-revalidate strategy, as defined by Workbox, I have the following questions:
1) First, about the mechanism itself: does stale-while-revalidate mean using the cache until the network responds and then using the network data, or just using the network response to renew the cached data for next time?
2) If the network response is only cached for next time, what scenarios are real use cases for that?
3) And if the network response should replace the cached data in the app immediately, how can that be done in a service worker? The fetch hook is already resolved with the cached data, so the network data can no longer be delivered (with respondWith).
1) It means the latter. The idea is simple: respond immediately from the cache, then refresh the cache in the background for the next time.
2) All scenarios where it is not important to always get the very latest version of the page/app =) I'm using the stale-while-revalidate strategy on two different web applications, one for public transportation services and one for displaying restaurant menu information. Many sites/apps are just fine with this, but of course not all.
One very important thing to note here on #2:
You could, for example, use stale-while-revalidate only for static assets. This way your HTML, JS, CSS, images etc. would be cached and quickly served to the user, but the data fetched dynamically from an API could still be fresh. For some apps this works, for others not so well. It depends completely on the app. Of course, you have to remember not to change the semantics of your API while users may still be running a previous version of the app.
3) Not possible in any automatic way. What you could do, however, is implement a message channel between the Service Worker and the regular JS code on the page using the postMessage API (Client.postMessage() in the Service Worker, and a message listener on navigator.serviceWorker in the page). The page listens for certain messages, and the Service Worker sends one when an important change has happened and the cache has been updated. Then you could either show the user a prompt saying that the page needs to be reloaded right now, or even force a reload from JS. The logic that decides when an important update has happened would of course have to live in the Service Worker.
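The answers above can be combined into a small sketch: serve the stale copy at once, refresh the Cache API entry in the background, and tell open pages that a newer copy exists. Everything named here (staleWhileRevalidate, broadcast, the 'CACHE_UPDATED' message type, the cache name 'assets-v1') is hypothetical, and the sketch notifies on every successful refresh rather than deciding whether a change is "important".

```javascript
// Returns the response to serve plus the promise of the background refresh.
// `notify` is called after a stale entry has been replaced in the cache.
async function staleWhileRevalidate(request, cache, fetchFn, notify) {
  const cached = await cache.match(request);
  const refresh = fetchFn(request).then(async (response) => {
    if (response.ok) {
      await cache.put(request, response.clone());
      if (cached) notify({ type: 'CACHE_UPDATED', url: request.url });
    }
    return response;
  });
  if (cached) {
    // The stale copy is served, so a failed refresh must not surface as an error.
    refresh.catch(() => {});
    return { response: cached, refresh };
  }
  // Nothing cached yet: the first visit has to wait for the network.
  return { response: await refresh, refresh };
}

// Wiring, only executed inside a real service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  // Sends a message to every open window controlled by this worker.
  const broadcast = async (message) => {
    const clients = await self.clients.matchAll({ type: 'window' });
    for (const client of clients) client.postMessage(message);
  };

  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return;
    const work = caches.open('assets-v1').then((cache) =>
      staleWhileRevalidate(event.request, cache, fetch, broadcast));
    event.respondWith(work.then((result) => result.response));
    // Keep the worker alive until the background refresh has settled.
    event.waitUntil(work.then((result) => result.refresh).catch(() => {}));
  });
}

// Page side (not part of the worker), shown as a comment:
// navigator.serviceWorker.addEventListener('message', (event) => {
//   if (event.data.type === 'CACHE_UPDATED') { /* prompt the user or reload */ }
// });
```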

The difference between browser cache and ServiceWorker cache

I do not understand the difference between browser cache and ServiceWorker cache.
For example, with the browser cache you set an expiration time for all resources. Within that time limit, the browser should not need to verify the resource with the server (no HEAD request). In other words, you should be able to get the resources while offline, because the server is not queried.
On the other hand, if you use a cache-first strategy in a ServiceWorker, you can get resources while offline from the second visit onward.
"Both the browser cache and the ServiceWorker cache can deliver resources while offline."
Is that a correct understanding?
I think by "browser cache" you mean the HTTP cache. This is an opportunistic cache of responses across the entire browser. (Mostly, that is. In some browsers it's isolated by the top-level tab origin.) The browser can evict responses from the HTTP cache at any time. It makes no guarantees that data will be present in the HTTP cache at any time. Generally, though, it uses an LRU-based heuristic to age older unused data out. Sites can influence what is stored in the HTTP cache using Cache-Control headers.
In contrast, the Cache API used in service workers is more like IndexedDB. While it stores responses just like the HTTP cache, it differs in that the site is fully in control. There is an explicit API for storing and retrieving data. The browser guarantees Cache API data will not be deleted unless the site does it itself or the entire origin is evicted via the quota mechanism. The Cache API is also much more precisely specified in terms of its behavior compared to the HTTP cache. The only way to use Cache API data during loading, though, is via a ServiceWorker that matches a request using the Cache API and then returns the Response to FetchEvent.respondWith().
Note a ServiceWorker can end up interacting with both of these systems. It can explicitly use the Cache API. It can also pull from http cache when it calls fetch().
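A minimal cache-first sketch shows where the two systems meet: the lookup with cache.match() only touches the Cache API, while the fallback fetch() goes through the HTTP cache first and may be answered from there without reaching the network. The function name cacheFirst and the cache name 'static-v1' are assumptions for illustration.

```javascript
// Cache-first: answer from the Cache API if possible, otherwise fetch and store.
async function cacheFirst(request, cache, fetchFn) {
  const hit = await cache.match(request);
  if (hit) return hit;
  // With the default cache mode, this fetch may be served by the HTTP cache.
  const response = await fetchFn(request);
  if (response.ok) {
    // Explicit write: the Cache API never fills itself.
    await cache.put(request, response.clone());
  }
  return response;
}

// Wiring, only executed inside a real service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return;
    event.respondWith(
      caches.open('static-v1').then((cache) => cacheFirst(event.request, cache, fetch))
    );
  });
}
```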

From a purely caching point of view, is there any advantage using the new Cache API instead of regular http cache?

The arrival of service workers has led to a great number of improvements to the web. There are many use cases for service workers.
However, from a purely caching point of view, does it make sense to use the Cache API?
Many approaches make assumptions of how resources will be handled.
Often only the URL is used to determine how the resource should be handled with strategies such as Network first, Network Only, Stale-while-revalidate, Cache first and Cache only. This can be tedious work, because you have to define a specific handler for many URLs. It's not scalable.
Instead, I was thinking of using the regular HTTP cache in combination with the Cache API. Response headers contain useful information that can be used to cache a resource and to verify whether the cached copy can still be used or a new version is available. Together with best-practice caching (for example https://jakearchibald.com/2016/caching-best-practices/), this could create a generic service worker that does not have to be updated when resources change.
Based on the response headers, a resource could be handled by a custom handler. If the headers would ever be updated, it would be possible to handle the resource with a different handler if necessary.
But then I realised, I was just reimplementing browser cache with the Cache API. This would mean that the resources would be cached double (take this with a grain of salt), by storing it in both the browser and the service worker cache. Additionally, while the Cache API provides more control, most handlers can be (sort of) simulated with http cache:
Network only: Cache-Control: no-store
Cache only: Cache-Control: immutable
Cache first: Cache-Control: max-age with validation (Etag, Last Modified, ...)
Stale-while-revalidate: Cache-Control: stale-while-revalidate
I don't immediately see how to simulate network first, but then again this would imply support for offline usage (or bad connection). (Keep in mind, this is not the use case I'm looking for).
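Read in the other direction, the mapping above suggests how a generic service worker could choose a handler from a response's Cache-Control header instead of from its URL. The sketch below is one possible reading of that idea, not an established API: the names parseCacheControl and strategyFor are hypothetical, and falling back to 'network-first' when no directive matches is an assumption.

```javascript
// Parses a Cache-Control header value into a Map of directive -> value (or true).
function parseCacheControl(headerValue) {
  const directives = new Map();
  for (const part of (headerValue || '').split(',')) {
    const [name, value] = part.trim().split('=');
    if (name) {
      directives.set(name.toLowerCase(), value === undefined ? true : value.replace(/"/g, ''));
    }
  }
  return directives;
}

// Picks a strategy name following the mapping in the question.
function strategyFor(headerValue) {
  const cc = parseCacheControl(headerValue);
  if (cc.has('no-store')) return 'network-only';
  if (cc.has('immutable')) return 'cache-only';
  if (cc.has('stale-while-revalidate')) return 'stale-while-revalidate';
  if (cc.has('max-age')) return 'cache-first';
  return 'network-first'; // assumption: default when nothing else applies
}

console.log(strategyFor('max-age=60, stale-while-revalidate=600'));
```

In a fetch handler, the header would typically come from response.headers.get('Cache-Control') on a previously stored or freshly fetched response.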
While it's always useful to provide a fallback (using service workers & Cache API), is it worth having the resources possibly cached double and having copied the browser's caching logic? I'm aware that the Cache API can be used to precache resources, but I think these could also be precached by requesting them in advance.
Lastly, I know the browser is in charge of managing the browser cache and a developer has limited control over it (using HTTP Cache headers).
But the browser could also choose to remove the whole service worker cache to clear disk space. There are ways to make sure the cache persists, but that's not the point here.
My questions are:
What advantages has the Cache API that can't be simulated with regular browser cache?
What could be cached with the Cache API, but not with regular browser cache?
Is there another way to create a service worker that does not need to be updated?
What advantages has the Cache API that can't be simulated with regular browser cache?
The Cache API was created to be manipulated by a Service Worker, so you can do nearly anything you want with it, but you can't interfere with or do anything to the HTTP cache; that is all internal browser machinery. I'm not sure, but I believe the HTTP cache can't be relied on when you're offline, whereas the Cache API can.
What could be cached with the Cache API, but not with regular browser cache?
Before caching a request, you can alter the request to fit your needs, or even cache a response sent with Cache-Control: 0 if you want. You can even store custom data that you will need later.
Is there another way to create a service worker that does not need to be updated?
It needs a bit of work, but two possible solutions are:
On each page load, the page communicates with the SW using postMessage to compare versions (the version can be an id, a hash, or even the whole list of assets). If the version is different, you load the list of resources from a given location and add them to the cache. (Because this relies on JavaScript, it won't work if it has to work from AMP.)
Each time a user loads a page, or every 10-20 minutes (or both, whatever you want), you request a file that tells you your asset version. If it's different, you do the same thing as in the other solution.
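The first solution could look roughly like this. It is a sketch under assumptions: the message type 'CHECK_VERSION', the 'assets-' cache name prefix, and the function name ensureAssets are invented for illustration, and the version is encoded in the cache name so that it survives the worker being stopped and restarted.

```javascript
// Makes sure the assets for `version` are in the Cache API.
// Returns true if a new cache was populated, false if it already existed.
async function ensureAssets(version, assetUrls, cacheStorage) {
  const name = 'assets-' + version;
  if (await cacheStorage.has(name)) return false;

  const cache = await cacheStorage.open(name);
  try {
    await cache.addAll(assetUrls);
  } catch (err) {
    // Do not leave an empty cache behind that would look like a valid version.
    await cacheStorage.delete(name);
    throw err;
  }

  // Drop asset caches that belong to older versions.
  for (const key of await cacheStorage.keys()) {
    if (key.startsWith('assets-') && key !== name) await cacheStorage.delete(key);
  }
  return true;
}

// Wiring, only executed inside a real service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('message', (event) => {
    const { type, version, assets } = event.data || {};
    if (type === 'CHECK_VERSION') {
      event.waitUntil(ensureAssets(version, assets, caches));
    }
  });
}

// Page side (not part of the worker), shown as a comment:
// navigator.serviceWorker.ready.then((registration) => {
//   registration.active.postMessage({
//     type: 'CHECK_VERSION', version: 'abc123', assets: ['/app.js', '/app.css'],
//   });
// });
```

The second solution (polling a version file) could reuse the same function: fetch the file, read the version and asset list from it, and call ensureAssets with the result.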
Hope this helps.

How can I cache network data in service worker?

I have created a Progressive Web Application (PWA) with Angular 5.0 and .NET Core 2.0. It works fine in offline mode, but only static data is cached for offline use. I need to store previously requested network data in the service worker cache, so that I can fetch this data from the service worker cache in offline mode.
You can also use the Angular service worker for this.
Data Groups - Cache External API Data
The data groups config allows you to cache external API calls, which makes it possible for your app to use an external data source without a network connection. This data is not known at build-time, so it can only be cached at runtime. There are two possible strategies for caching data sources - freshness and performance.
api-freshness - This freshness strategy will attempt to serve data from the network first, then fall back to the cache. You can set a maxAge property that defines how long to cache responses and a timeout that defines how long to wait before falling back to the cache.
api-performance - The performance cache will serve data from the cache first and only reach out to the network if the cache is expired.
You can find an example in the ngsw-config.json section.
Also take a look at HTTP caching.
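As a rough illustration of the two data groups described above, the dataGroups part of such a configuration might look like the following. The real file is plain JSON (ngsw-config.json); it is written here as a JavaScript object and serialized, and the URL patterns, sizes, and durations are placeholder assumptions.

```javascript
// Hypothetical dataGroups section for ngsw-config.json.
const ngswConfig = {
  index: '/index.html',
  dataGroups: [
    {
      // Network first, fall back to the cache after the timeout.
      name: 'api-freshness',
      urls: ['/api/orders/**'],
      cacheConfig: { strategy: 'freshness', maxSize: 100, maxAge: '1h', timeout: '5s' },
    },
    {
      // Cache first, go to the network only when the entry has expired.
      name: 'api-performance',
      urls: ['/api/catalog/**'],
      cacheConfig: { strategy: 'performance', maxSize: 100, maxAge: '1d' },
    },
  ],
};

// Print the JSON that would go into ngsw-config.json.
console.log(JSON.stringify(ngswConfig, null, 2));
```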
All you need to do is ensure that each server response provides the
correct HTTP header directives to instruct the browser on when and for
how long the browser can cache the response.
For further info, you can check the whole documentation. It provides example and illustrations to understand better about HTTP Caching.

Akamai caching strategy

We started caching our static pages in Akamai with a user-defined TTL (for example, 7 days). We want control over caching, so on the 7th day we will purge this cache and recreate it by curling all cached pages.
The issue is that Akamai serves pages from the geographically nearest node, so there is no control over or validation of cache creation. My questions are:
A. How can I ensure the purge happens on all nodes?
B. How can I ensure that, while curling URLs, the cache is updated on all nodes?
C. Is there a better way of controlling the cache in Akamai?
From what I know, if you've configured a TTL in Akamai, the elements in the cache become stale after the defined period. When a request then reaches a node whose copy has become stale, the node goes to the origin or its parent node (if the server is a child) to refresh the stale content. You don't have to explicitly curl a URL to refresh it. Alternatively, if you want to force a cache refresh, you can use the Akamai APIs or the Edgesuite interface to refresh the cache manually.

Resources