How can I configure Varnish to cache range requests?

I'm trying to configure Varnish to cache range requests. I notice the http_range_support option, but everything I've read says that this will attempt to cache the entire file before satisfying the request. Is it possible to serve range requests without requiring that the entire file already be cached?

That depends on the Varnish version. From Varnish 3.0.2 you can stream uncached content to clients while Varnish fetches and caches the full object in the background.
https://www.varnish-software.com/blog/http-streaming-varnish
"Basically, his code lifts the limitations of the 3.0 release and allows Varnish to deliver the objects, while they are being fetched, to multiple clients."
The feature is enabled through beresp.do_stream.
https://www.varnish-software.com/blog/streaming-varnish-30
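A minimal VCL sketch for Varnish 3.0.2+ that turns streaming on (adapt it to your own VCL; whether range requests are served from the partially fetched object still depends on your version and on http_range_support):

    sub vcl_fetch {
        # Deliver the object to waiting clients while the backend fetch
        # is still in progress, instead of buffering the whole body first.
        set beresp.do_stream = true;
    }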

Related

The difference between browser cache and ServiceWorker cache

I do not understand the difference between browser cache and ServiceWorker cache.
For example, with the browser cache you can set an expiration time for all resources. Within that time limit the browser should not revalidate with the server, so you should be able to get the resources while offline, because the server is never queried.
On the other hand, if you configure a ServiceWorker to prefer its cache, you can also get resources while offline from the second visit onward.
"Both the browser cache and the ServiceWorker cache can serve resources while offline."
Is that a correct understanding?
I think by "browser cache" you mean the http cache. This is an opportunistic cache of responses across the entire browser. (Mostly, that is. In some browsers its isolated by the top level tab origin.) The browser can evict responses from http cache at any time. It makes no guarantees that data will be present in the http cache at any time. Generally, though, it uses an LRU-based heuristic to age older unused data out. Sites can influence what is stored in http cache using cache-control headers.
In contrast, the Cache API used in service workers is more like IndexedDB. While it stores responses just like the http cache, it differs in that the site is fully in control. There is an explicit API for storing and retrieving data. The browser guarantees Cache API data will not be deleted unless the site deletes it itself or the entire origin is evicted via the quota mechanism. The Cache API is also much more precisely specified in terms of its behavior than the http cache. The only way to use Cache API data during loading, though, is via a ServiceWorker that matches a request against the Cache API and then passes the Response to FetchEvent.respondWith().
Note a ServiceWorker can end up interacting with both of these systems. It can explicitly use the Cache API. It can also pull from http cache when it calls fetch().
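A minimal sketch of that interaction (the cache name 'v1' is a placeholder): the service worker answers from the Cache API when it can, and otherwise calls fetch(), which may itself be satisfied from the http cache.

    self.addEventListener('fetch', (event) => {
      event.respondWith(
        caches.match(event.request).then((cached) => {
          if (cached) {
            return cached; // explicit Cache API hit, fully controlled by the site
          }
          // Fall through to the network; the browser may still answer
          // this fetch() from its http cache.
          return fetch(event.request).then((response) => {
            const copy = response.clone();
            caches.open('v1').then((cache) => cache.put(event.request, copy));
            return response;
          });
        })
      );
    });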

From a purely caching point of view, is there any advantage using the new Cache API instead of regular http cache?

The arrival of service workers has led to a great number of improvements to the web. There are many use cases for service workers.
However, from a purely caching point of view, does it make sense to use the Cache API?
Many approaches make assumptions about how resources will be handled. Often only the URL is used to determine how the resource should be handled, with strategies such as Network first, Network only, Stale-while-revalidate, Cache first and Cache only. This can be tedious work, because you have to define a specific handler for many URLs, and it doesn't scale.
Instead I was thinking of using the regular HTTP cache in combination with the Cache API. Response headers contain useful information that can be used to cache a resource and to verify whether the cached copy can still be used or a new version is available. Together with best-practice caching (for example https://jakearchibald.com/2016/caching-best-practices/), this could create a generic service worker that does not have to be updated when resources change.
Based on the response headers, a resource could be handled by a custom handler. If the headers were ever updated, the resource could be handled by a different handler if necessary.
But then I realised I was just reimplementing the browser cache with the Cache API. This would mean that the resources would be cached twice (take this with a grain of salt), by storing them in both the browser cache and the service worker cache. Additionally, while the Cache API provides more control, most handlers can be (sort of) simulated with the http cache:
Network only: Cache-Control: no-store
Cache only: Cache-Control: immutable
Cache first: Cache-Control: max-age with validation (ETag, Last-Modified, ...)
Stale-while-revalidate: Cache-Control: stale-while-revalidate
I don't immediately see how to simulate network first, but then again this would imply support for offline usage (or bad connection). (Keep in mind, this is not the use case I'm looking for).
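For reference, "network first" in a service worker is usually just a fetch that falls back to the cache on failure; a minimal sketch, with 'offline-cache' as a placeholder name:

    self.addEventListener('fetch', (event) => {
      event.respondWith(
        fetch(event.request)
          .then((response) => {
            // Keep a copy so the cache fallback stays up to date.
            const copy = response.clone();
            caches.open('offline-cache').then((cache) => cache.put(event.request, copy));
            return response;
          })
          .catch(() => caches.match(event.request)) // offline / network failure
      );
    });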
While it's always useful to provide a fallback (using service workers & the Cache API), is it worth having the resources possibly cached twice and having copied the browser's caching logic? I'm aware that the Cache API can be used to precache resources, but I think these could also be precached by requesting them in advance.
Lastly, I know the browser is in charge of managing the browser cache and a developer has limited control over it (using HTTP Cache headers).
But the browser could also choose to remove the whole service worker cache to clear disk space. There are ways to make sure the cache persists, but that's not the point here.
My questions are:
What advantages has the Cache API that can't be simulated with regular browser cache?
What could be cached with the Cache API, but not with regular browser cache?
Is there another way to create a service worker that does not need to be updated?
What advantages has the Cache API that can't be simulated with regular browser cache?
The Cache API was created to be manipulated from a Service Worker, so you can do nearly anything you want with it. You can't interfere with or control the HTTP cache at all; that is entirely browser mechanics. I'm not sure, but the HTTP cache seems completely unreliable when you're offline, whereas the Cache API is not.
What could be cached with the Cache API, but not with regular browser cache?
Before caching a request you can alter it to fit your needs, or even cache a response that was sent with Cache-Control: max-age=0 (or no-store) if you want. You can even store custom data that you will need later.
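A rough sketch of that (the cache name 'custom-data' and the URLs are placeholders): rewrite a request before caching it, and put a hand-built Response into the cache as custom data.

    caches.open('custom-data').then(async (cache) => {
      // Cache under a rewritten request instead of the original one.
      const rewritten = new Request('/api/v1/profile', {
        headers: { Accept: 'application/json' },
      });
      const response = await fetch(rewritten);
      await cache.put(rewritten, response.clone());

      // Store arbitrary custom data that never came from the network.
      await cache.put('/app-state', new Response(JSON.stringify({ lastSync: Date.now() })));
    });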
Is there another way to create a service worker that does not need to be updated?
It needs a bit of work, but two possible ways to achieve that are:
On each page load you communicate with the SW using postMessage to compare a version (it can be an id, a hash or even the whole list of assets); if it's different, you can load the list of resources from a given location and add it to the cache, as in the sketch after this list. (Because this relies on JavaScript, it won't work if you have to make it work from AMP.)
Each time a user loads a page, or every 10/20 minutes (or both, whatever you want), you request a file to find out your asset version; if it's different, you do the same thing as in the other solution.
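A rough sketch of the first approach (VERSION, EXPECTED_VERSION and '/assets.json' are placeholder names):

    // In the page: ask the controlling service worker which version it has.
    const EXPECTED_VERSION = 'abc123'; // placeholder: version this page was built with
    if (navigator.serviceWorker.controller) {
      navigator.serviceWorker.controller.postMessage({ type: 'GET_VERSION' });
    }
    navigator.serviceWorker.addEventListener('message', (event) => {
      if (event.data.type === 'VERSION' && event.data.version !== EXPECTED_VERSION) {
        navigator.serviceWorker.controller.postMessage({ type: 'REFRESH_ASSETS' });
      }
    });

    // In the service worker: report the version and refresh the cache on demand.
    const VERSION = 'abc123';              // placeholder id/hash
    const ASSET_LIST_URL = '/assets.json'; // placeholder list of resources

    self.addEventListener('message', (event) => {
      if (event.data.type === 'GET_VERSION') {
        event.source.postMessage({ type: 'VERSION', version: VERSION });
      } else if (event.data.type === 'REFRESH_ASSETS') {
        event.waitUntil(
          fetch(ASSET_LIST_URL)
            .then((r) => r.json())
            .then((assets) => caches.open('assets-' + VERSION).then((cache) => cache.addAll(assets)))
        );
      }
    });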
Hope this helps.

How can I cache network data in service worker?

I have created a Progressive Web Application (PWA) with Angular 5.0 and .NET Core 2.0. It works fine in offline mode, but only static assets are cached for offline use. I need to store previously requested network data in the service worker cache, so that I can serve that data from the cache when offline.
You can also use the Angular service worker for this.
Data Groups - Cache External API Data
The data groups config allows you to cache external API calls, which makes it possible for your app to use an external data source without a network connection. This data is not known at build-time, so it can only be cached at runtime. There are two possible strategies for caching data sources - freshness and performance.
api-freshness - This freshness strategy will attempt to serve data from the network first, then fall back to the cache. You can set a maxAge property that defines how long to cache responses and a timeout that defines how long to wait before falling back to the cache.
api-performance - The performance cache will serve data from the cache first and only reach out to the network if the cache is expired.
You can find an example here, in the ngsw-config.json section.
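A minimal sketch of such a dataGroups entry in ngsw-config.json (the group name and URL pattern are placeholders):

    {
      "dataGroups": [
        {
          "name": "api-calls",
          "urls": ["/api/**"],
          "cacheConfig": {
            "strategy": "freshness",
            "maxSize": 100,
            "maxAge": "1d",
            "timeout": "10s"
          }
        }
      ]
    }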
Try to check HTTP Caching.
All you need to do is ensure that each server response provides the correct HTTP header directives to instruct the browser on when and for how long the browser can cache the response.
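For example, response headers along these lines (the values are only illustrative) let the browser reuse the response for ten minutes and then revalidate it:

    Cache-Control: max-age=600, must-revalidate
    ETag: "33a64df5"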
For further info, you can check the whole documentation. It provides examples and illustrations to help you understand HTTP Caching better.

Stop Service Worker Fetch Event caching specific files

I'm listening for fetch events in a Service Worker and would like to know when the response to a fetch gets cached.
1) When the fetch event is triggered? Can I call preventDefault() to stop it?
2) Do I need cache: 'no-store' in the init parameter of my fetch() call?
3) Does it happen at event.respondWith() time?
I have a data delivery strategy that seeks to deliver data as fresh as possible and only goes to the cache if the network is slow, but even then leaves the network fetch running to update the cache whenever it finishes.
Without something like { cache: 'no-cache' } in the init, aren't we at risk of fetch() just reading from the cache anyway?
I (wrongly?) assumed caching was opt-in and that I had to manually update the cache with put().
From https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-lifetimes:
The Cache instances are not part of the browser’s HTTP cache
Don't confuse the HTTP cache with the caches of the Cache API.
The HTTP cache lives between the service worker and the network, so if you want the browser not to cache the response (in the HTTP cache), you have to use the 'no-store' option when fetching the resource.
The other caches, instead, are completely under your control: if you don't store anything in them, you don't get anything from them.
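A small sketch of that distinction inside a fetch handler (the cache name 'my-cache' is a placeholder):

    self.addEventListener('fetch', (event) => {
      event.respondWith((async () => {
        // cache: 'no-store' makes this fetch bypass the browser's HTTP cache
        // entirely (it neither reads from nor writes to it).
        const response = await fetch(event.request, { cache: 'no-store' });

        // The Cache API only ever contains what you explicitly put into it.
        const cache = await caches.open('my-cache');
        await cache.put(event.request, response.clone());
        return response;
      })());
    });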
The spec says this should not be happening:
https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html#cache-lifetimes
5.2. Understanding Cache Lifetimes
The Cache instances are not part of the browser’s HTTP cache. The Cache objects are exactly what authors have to manage themselves. The Cache objects do not get updated unless authors explicitly request them to be. The Cache objects do not expire unless authors delete the entries. The Cache objects do not disappear just because the service worker script is updated. That is, caches are not updated automatically. Updates must be manually managed. This implies that authors should version their caches by name and make sure to use the caches only from the version of the service worker that can safely operate on.
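A common sketch of that manual management (the cache name is a placeholder): version the cache name and delete older versions in the activate handler.

    const CACHE_NAME = 'static-v2'; // bump this on every release

    self.addEventListener('activate', (event) => {
      event.waitUntil(
        caches.keys().then((names) =>
          Promise.all(
            names
              .filter((name) => name !== CACHE_NAME) // caches left by older versions
              .map((name) => caches.delete(name))
          )
        )
      );
    });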

ARR V2.0 Memory Cache - Html and Image cached objects never hit

I am trying out a caching solution with ARR V2 on my local machine. I have a single-node server farm pointing to the original server, and all requests coming in to my local machine are filtered through an inbound rule and directed to that server farm.
In the ARR Cache action panel I have added a Cache Control Rule to always cache all objects for 20 minutes for all requests directed to the server farm. Here I suppose I have overridden the memory cache duration settings in the server farm's caching action panel. (Correct me if I'm wrong.)
Some other settings on IIS include:
Output caching is disabled for both user mode and kernel mode (a bit weird to me that objects are still cached in the kernel cache)
The server proxy setting is disabled in the ARR Cache action panel (I suppose this doesn't matter, because the server farm is also a proxy)
After completing all the settings, I can see that all objects are cached in my disk cache when I trigger a web request. Through "netsh http show cache" I can also observe that they are all cached in the kernel cache (the ARR memory cache).
However, when I trigger the web request again, although the request does not go to the original server, requests for html and images don't hit the memory cache: through "netsh http show cache" I can see that their hit count remains '1' and their TTL is refreshed to stay in sync with the corresponding disk-cached objects. In contrast, css and javascript always hit the memory cache until they expire. In my MIME list they are all treated as static files, so I don't understand why they behave differently here.
Then I cleared the disk-cached objects (both primary and secondary) and triggered the web request again. Now (note that all objects are still in the memory cache), only html and images are cached again in the disk cache, while requests for css and javascript are still served from the memory cache, so they don't appear in the disk cache.
I have just started experimenting with ARR and don't fully understand the concepts yet. I hope someone can help.
Thanks in advance.
