Akamai caching strategy

We started caching our static pages in Akamai with a user-defined TTL (for example, 7 days). We want control over caching, so on the 7th day we purge this cache and recreate it by curling all cached pages.
The issue is that Akamai serves pages from the geographically nearest node, so we have no control over, or validation of, cache creation. My questions are:
A. How can I ensure the purge happens on all nodes?
B. How can I ensure that, when curling URLs, the cache is updated on all nodes?
C. Is there a better way of controlling the cache in Akamai?

From what I know, if you've configured a TTL in Akamai, cached objects become stale after the defined period. The first request that reaches a node after that point makes the node go to the origin (or to its parent node, if it is a child) to refresh the stale content. You don't have to explicitly curl a URL to refresh it. Alternatively, if you want to force a refresh, you can use the Akamai APIs or the EdgeSuite interface to refresh the cache manually.
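To illustrate the TTL behaviour described above, here is a minimal TypeScript sketch of a single cache node. It is a toy model and not Akamai's API: `EdgeNode`, `originHits` and the `origin` callback are hypothetical names introduced only for this example, and the current time is passed in explicitly so the behaviour is easy to follow.

```typescript
// Toy model of one cache node with a TTL. Not Akamai's API; all names are hypothetical.
type Entry = { body: string; storedAt: number };

class EdgeNode {
  ttlMs: number;
  origin: (url: string) => string; // stands in for a request to the origin or parent node
  originHits = 0;
  entries = new Map<string, Entry>();

  constructor(ttlMs: number, origin: (url: string) => string) {
    this.ttlMs = ttlMs;
    this.origin = origin;
  }

  // Serve from cache while fresh; otherwise refresh from the origin on demand.
  get(url: string, now: number): string {
    const entry = this.entries.get(url);
    if (entry !== undefined && now - entry.storedAt < this.ttlMs) {
      return entry.body;
    }
    this.originHits += 1;
    const body = this.origin(url);
    this.entries.set(url, { body, storedAt: now });
    return body;
  }

  // Forced refresh: drop the entry so the next request goes to the origin.
  purge(url: string): void {
    this.entries.delete(url);
  }
}

const sevenDaysMs = 7 * 24 * 60 * 60 * 1000;
const node = new EdgeNode(sevenDaysMs, (url) => `page for ${url}`);
node.get("/index.html", 0);    // miss: fetched from the origin
node.get("/index.html", 1000); // hit: still within the 7-day TTL
node.purge("/index.html");
node.get("/index.html", 2000); // miss again after the purge
console.log(node.originHits);  // 2
```

In this model each node keeps its own entries, so a request warms only the node that serves it, which matches the concern raised in the question.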

Related

TYPO3 caching issue in clustered deployment

I have an issue with TYPO3 and caching.
We have the following setup:
1 Nginx load balancer (ip_hash, i.e. sticky sessions)
2 TYPO3 web instances
1 Redis cache shared by both TYPO3 instances
The issue is that when the first web server serves a given page, it gets cached. As long as the same web server is serving that page, the cached version is returned.
As soon as the request for that page is served by the other web server, the full cache gets reloaded.
I noticed that additional items are added to the cache although the page content has not changed.
Is there anything I could check to avoid these unnecessary cache reloads?
There are some things to consider before scaling TYPO3 horizontally: https://stackoverflow.com/a/63594837/2819581
Basically, the database, the caches and some directories all carry state, and these pieces of state are not independent of each other.

The difference between browser cache and ServiceWorker cache

I do not understand the difference between the browser cache and the ServiceWorker cache.
For example, with the browser cache, suppose an expiration time is set for all resources. In that case, no validation request (such as a HEAD) should be sent within the time limit. In other words, you should be able to get the resources while offline, because the server is not queried.
On the other hand, if you give the cache priority in a ServiceWorker, you can get resources while offline from the second visit onward.
"Both the browser cache and the ServiceWorker cache can serve resources while offline."
Is that understanding correct?
I think by "browser cache" you mean the HTTP cache. This is an opportunistic cache of responses across the entire browser. (Mostly, that is. In some browsers it's isolated by the top-level tab origin.) The browser can evict responses from the HTTP cache at any time. It makes no guarantee that data will be present in the HTTP cache at any given moment. Generally, though, it uses an LRU-based heuristic to age out older, unused data. Sites can influence what is stored in the HTTP cache using Cache-Control headers.
In contrast, the Cache API used in service workers is more like IndexedDB. While it stores responses just like the HTTP cache, it differs in that the site is fully in control. There is an explicit API for storing and retrieving data. The browser guarantees that Cache API data will not be deleted unless the site deletes it itself or the entire origin is evicted via the quota mechanism. The Cache API is also much more precisely specified in terms of its behavior than the HTTP cache. The only way to use Cache API data during loading, though, is via a ServiceWorker that matches a request using the Cache API and then passes the Response to FetchEvent.respondWith().
Note that a ServiceWorker can end up interacting with both of these systems. It can explicitly use the Cache API. It can also pull from the HTTP cache when it calls fetch().
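The interaction between the two caches can be sketched as a cache-first lookup. The TypeScript below is a minimal sketch under stated assumptions: `CacheLike` and `cacheFirst` are hypothetical names, and the cache and fetch function are passed in as plain parameters so the logic runs outside a browser. In a real service worker the cache would come from `caches.open()`, the lookup would run inside a `fetch` event handler via `event.respondWith()`, and the response would need to be cloned before being stored, because a Response body can only be read once.

```typescript
// Structural stand-in for the parts of the Cache API used here (hypothetical type).
interface CacheLike<Res> {
  match(url: string): Promise<Res | undefined>;
  put(url: string, response: Res): Promise<void>;
}

// Cache-first: answer from site-controlled storage, otherwise go to the network.
async function cacheFirst<Res>(
  url: string,
  cache: CacheLike<Res>,
  fetchFn: (url: string) => Promise<Res>,
): Promise<Res> {
  const cached = await cache.match(url);
  if (cached !== undefined) {
    return cached; // nothing is evicted from here unless the site or quota eviction removes it
  }
  // In a browser, this call may itself be answered from the HTTP cache.
  const fresh = await fetchFn(url);
  await cache.put(url, fresh);
  return fresh;
}
```

Passing the dependencies in explicitly keeps the sketch easy to exercise with an in-memory `Map`; it says nothing about how any particular browser implements either cache.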

From a purely caching point of view, is there any advantage using the new Cache API instead of regular http cache?

The arrival of service workers has led to a great number of improvements to the web. There are many use cases for service workers.
However, from a purely caching point of view, does it make sense to use the Cache API?
Many approaches make assumptions of how resources will be handled.
Often only the URL is used to determine how the resource should be handled with strategies such as Network first, Network Only, Stale-while-revalidate, Cache first and Cache only. This can be tedious work, because you have to define a specific handler for many URLs. It's not scalable.
Instead, I was thinking of using the regular HTTP cache in combination with the Cache API. Response headers contain useful information that can be used to cache a resource and to verify whether the cached copy can still be used or a new version is available. Together with best-practice caching (for example https://jakearchibald.com/2016/caching-best-practices/), this could produce a generic service worker that does not have to be updated when resources change.
Based on the response headers, a resource could be handled by a custom handler. If the headers would ever be updated, it would be possible to handle the resource with a different handler if necessary.
But then I realised, I was just reimplementing browser cache with the Cache API. This would mean that the resources would be cached double (take this with a grain of salt), by storing it in both the browser and the service worker cache. Additionally, while the Cache API provides more control, most handlers can be (sort of) simulated with http cache:
Network only: Cache-Control: no-store
Cache only: Cache-Control: immutable
Cache first: Cache-Control: max-age with validation (ETag, Last-Modified, ...)
Stale-while-revalidate: Cache-Control: stale-while-revalidate
I don't immediately see how to simulate network first, but then again this would imply support for offline usage (or bad connection). (Keep in mind, this is not the use case I'm looking for).
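The mapping listed above can be written down as a small lookup. This is a minimal TypeScript sketch, not an established API: `Strategy` and `cacheControlFor` are hypothetical names. It also makes one assumption that goes beyond the list: `immutable` is paired with `max-age`, since `immutable` only has an effect while the response is still fresh.

```typescript
type Strategy =
  | "network-only"
  | "cache-only"
  | "cache-first"
  | "stale-while-revalidate";

// Returns the Cache-Control value that approximates each service worker strategy.
// maxAge and staleWindow are in seconds.
function cacheControlFor(
  strategy: Strategy,
  maxAge: number,
  staleWindow: number = 0,
): string {
  switch (strategy) {
    case "network-only":
      return "no-store";
    case "cache-only":
      // Assumption: immutable is combined with max-age so the response stays fresh.
      return `max-age=${maxAge}, immutable`;
    case "cache-first":
      // Validators (ETag, Last-Modified) are separate response headers.
      return `max-age=${maxAge}`;
    case "stale-while-revalidate":
      return `max-age=${maxAge}, stale-while-revalidate=${staleWindow}`;
  }
}

console.log(cacheControlFor("stale-while-revalidate", 60, 600));
// max-age=60, stale-while-revalidate=600
```

Network first has no entry here, for the reason given above: there is no single header that expresses it.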
While it's always useful to provide a fallback (using service workers & Cache API), is it worth having the resources possibly cached double and having copied the browser's caching logic? I'm aware that the Cache API can be used to precache resources, but I think these could also be precached by requesting them in advance.
Lastly, I know the browser is in charge of managing the browser cache and a developer has limited control over it (using HTTP Cache headers).
But the browser could also choose to remove the whole service worker cache to clear disk space. There are ways to make sure the cache persists, but that's not the point here.
My questions are:
What advantages has the Cache API that can't be simulated with regular browser cache?
What could be cached with the Cache API, but not with regular browser cache?
Is there another way to create a service worker that does not need to be updated?
What advantages has the Cache API that can't be simulated with regular browser cache?
The Cache API was created to be manipulated by a service worker, so you can do nearly anything you want with it. By contrast, you can't interfere with or directly manipulate the HTTP cache; that is all handled by the browser's internal mechanics. I'm not sure, but I believe the HTTP cache can't be relied on when you're offline, whereas the Cache API can.
What could be cached with the Cache API, but not with regular browser cache?
Before caching a request, you can alter it to fit your needs, or even cache a response whose Cache-Control header would otherwise prevent caching, if you want. You can even store custom data that you will need later.
Is there another way to create a service worker that does not need to be updated?
It needs a bit of work, but here are two ways to achieve that:
On each page load, the page communicates with the SW using postMessage to compare versions (the version can be an id, a hash or even the whole list of assets). If it's different, the SW can load the list of resources from a given location and add them to the cache. (Because this relies on JavaScript, it won't work if it has to run from AMP.)
Each time a user loads a page, or every 10 to 20 minutes (or both, whatever you want), you request a file that gives your asset version. If it's different, you do the same thing as in the other solution.
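The version comparison at the core of both approaches can be sketched as a pure function. This is a minimal TypeScript sketch with hypothetical names (`AssetManifest`, `assetsToFetch`); how the manifest reaches the service worker (a postMessage payload or a fetched file) and where the cached version is stored are left out as assumptions.

```typescript
// Hypothetical shape of the version information the page or a version file provides.
type AssetManifest = { version: string; assets: string[] };

// Returns the URLs the service worker should (re)add to its cache.
// cachedVersion is null when nothing has been cached yet.
function assetsToFetch(
  cachedVersion: string | null,
  manifest: AssetManifest,
): string[] {
  if (cachedVersion === manifest.version) {
    return []; // versions match: the cache is already up to date
  }
  return [...manifest.assets]; // versions differ: refresh everything listed
}

const manifest: AssetManifest = { version: "abc123", assets: ["/app.js", "/app.css"] };
console.log(assetsToFetch("abc123", manifest)); // []
console.log(assetsToFetch("old", manifest));    // [ '/app.js', '/app.css' ]
```

In a service worker, the returned list could be passed to the cache's `addAll()` method, after which the new version string would be recorded.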
Hope this helps.

Stop Service Worker Fetch Event caching specific files

I'm listening for fetch events in a Service Worker and would like to know when is the response to a fetch cached?
1) When the Fetch Event is triggered? PreventDefault() to stop it?
2) Do I need cache: no-store on my INIT parameter to my FETCH call?
3) Does it happen at event.respondWith() time?
I have a data delivery strategy that seeks to deliver data as fresh as possible and only go to cache if the network is slow. BUT even then leave the network fetch running to update the cache whenever it finishes.
Without something like INIT "no-cache" aren't we at risk of the FETCH just reading cache anyway?
I (wrongly?) assumed caching was opt-in and I had to manually update the cache with a PUT().
From https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-lifetimes:
The Cache instances are not part of the browser’s HTTP cache
Don't confuse the HTTP cache with the caches of the Cache API.
The HTTP cache sits between the service worker and the network, so
if you don't want the browser to cache the response in the HTTP cache, you have to use the 'no-store' cache option when fetching the resource.
The other caches, by contrast, are entirely under your control, so if you don't store anything in them, you don't get anything from them.
The spec says this should not be happening:
https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-lifetimes
5.2. Understanding Cache Lifetimes
The Cache instances are not part of the browser’s HTTP cache. The Cache objects are exactly what authors have to manage themselves. The Cache objects do not get updated unless authors explicitly request them to be. The Cache objects do not expire unless authors delete the entries. The Cache objects do not disappear just because the service worker script is updated. That is, caches are not updated automatically. Updates must be manually managed. This implies that authors should version their caches by name and make sure to use the caches only from the version of the service worker that can safely operate on.
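The advice to version caches by name can be sketched as a helper that decides which caches an updated service worker should delete. This is a minimal TypeScript sketch; `cachesToDelete` and the `prefix-version` naming scheme are assumptions made for illustration, not something the spec prescribes.

```typescript
// Given all existing cache names, return those that belong to this app
// (same prefix) but to an older version than the current one.
function cachesToDelete(
  existing: string[],
  prefix: string,
  currentVersion: string,
): string[] {
  const keep = `${prefix}-${currentVersion}`;
  return existing.filter(
    (name) => name.startsWith(`${prefix}-`) && name !== keep,
  );
}

console.log(cachesToDelete(["app-v1", "app-v2", "other-v1"], "app", "v2"));
// [ 'app-v1' ]
```

In a service worker, the names would typically come from `caches.keys()` in the `activate` handler, and each returned name would be passed to `caches.delete()`.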

ARR V2.0 Memory Cache - Html and Image cached objects never hit

I am trying out a caching solution with ARR V2 on my local machine. I have a single-node server farm pointing to the original server, and all requests coming in to my local machine are directed to that server farm by an inbound rule.
In the ARR Cache action panel I have added a Cache Control Rule to always cache all objects for 20 minutes for all requests directed to the server farm. I assume this overrides the memory cache duration setting in the server farm's caching action panel. (Correct me if I'm wrong.)
Some other settings on IIS include:
Output caching is disabled for both user mode and kernel mode (it seems a bit weird to me that objects are still cached in the kernel cache)
The server proxy setting is disabled in the ARR Cache action panel (I assume this has no effect, because the server farm is also a proxy)
After completing all the settings, I can see that all objects are cached in my disk cache when I trigger a web request. Through "netsh http show cache" I can also see that they are all cached in the kernel cache (the ARR memory cache).
However, when I trigger the web request again, although the request does not go to the original server, requests for HTML and images don't hit the memory cache: through "netsh http show cache" I can see that their hit count remains '1' and their TTL is refreshed to be in sync with the corresponding disk-cached objects. In contrast, CSS and JavaScript always hit the memory cache until they expire. In my MIME list they are all treated as static files, so I don't understand why they behave differently here.
Then I cleared the disk-cached objects (both primary and secondary) and triggered the web request again. Now (note that all objects are still in the memory cache), only HTML and images are cached in the disk cache, while requests for CSS and JavaScript are still served from the memory cache, so they don't appear in the disk cache.
I have just started experimenting with ARR and don't fully understand the concepts. I hope someone can help.
Thanks in advance.
