How to solve the cache slamming / stampeding problem with Varnish?

I'm considering using Varnish as a caching solution for our infrastructure, and I would like to ask: is there a mechanism inside Varnish that solves the cache slamming / stampeding problem?

Since Varnish 4.0, you can serve stale content while revalidating by using grace time (https://info.varnish-software.com/blog/grace-varnish-4-stale-while-revalidate-semantics-varnish).
As long as the object is still within its grace period, Varnish serves the stale cached content and asynchronously fetches a fresh response from the backend.
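As a minimal VCL 4.x sketch of that behaviour (the 10-second TTL and 6-hour grace below are illustrative values, not recommendations):

    sub vcl_backend_response {
        # Objects are considered fresh for 10 seconds...
        set beresp.ttl = 10s;
        # ...and kept for 6 more hours of grace: during that window Varnish
        # serves the stale copy and triggers an asynchronous background fetch.
        set beresp.grace = 6h;
    }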

Related

How to serve from Cloudflare cache without requesting the Cloudflare Worker?

I have a Cloudflare Worker whose responses can be cached for a long time. I know I can use the Cache API inside the Worker, but I want requests to never reach the Worker at all while the cache TTL has not expired.
There will be more than 10 million requests to this URL, and I don't see the point in paying for a Worker that, most of the time, will just fetch a response from the Cache API.
I know a workaround: just host the worker code on a server and use Page Rules to cache everything from that origin. But I'm wondering if I could use the Worker as the origin and somehow make Page Rules work with it. Simply setting a Page Rule to cache everything with a cache TTL of 1 month still routes all requests to the Worker and doesn't cache anything.
There's currently no way to do this.
It's important to understand that this is really a pricing question, not a technical question. Cloudflare has chosen to price Workers based on the traffic level of a site that is served using Workers. This pricing decision isn't necessarily based on Cloudflare's costs, and those costs wouldn't necessarily be lower if your Worker ran less often (the cost of deployment would not change, and the cost of executing a Worker is quite low), so it doesn't necessarily make sense for Cloudflare to offer a discount for Worker-based sites that manage to serve most responses from cache.
With that said, Cloudflare could very well decide to offer this discount in the future for competitive or other reasons. But, at this time, there are no plans for this.
There's a longer explanation on the Cloudflare forums: https://community.cloudflare.com/t/cache-in-front-of-worker/171258/8

TYPO3 caching issue in clustered deployment

I have an issue with TYPO3 and caching.
We have done the following setup:
1 Nginx load balancer (ip_hash, i.e. sticky sessions)
2 TYPO3 web instances
1 Redis cache shared by both TYPO3 instances
The issue is that when the first web server serves a given page, it gets cached. As long as the same web server is serving that page, the cached version is returned.
As soon as the page request is served by the other web server, the full cache gets reloaded.
I noticed that additional items are added to the cache although the page content has not changed.
Is there anything I could check to avoid these unnecessary cache reloads?
There are some considerations before scaling TYPO3 horizontally: https://stackoverflow.com/a/63594837/2819581
Basically, the database, caches, and some directories all carry state that is not independent of the others.

The difference between browser cache and ServiceWorker cache

I do not understand the difference between browser cache and ServiceWorker cache.
For example, with the browser cache you can set an expiration time for all resources. Within that time limit the browser should not send a revalidation request, so you should be able to retrieve the resources while offline because the server is never queried.
On the other hand, if you configure a cache-first strategy in a ServiceWorker, you can also retrieve resources while offline from the second visit onward.
"Both the browser cache and the ServiceWorker cache can serve resources while offline."
Is this understanding correct?
I think by "browser cache" you mean the http cache. This is an opportunistic cache of responses across the entire browser. (Mostly, that is. In some browsers its isolated by the top level tab origin.) The browser can evict responses from http cache at any time. It makes no guarantees that data will be present in the http cache at any time. Generally, though, it uses an LRU-based heuristic to age older unused data out. Sites can influence what is stored in http cache using cache-control headers.
In contrast, the Cache API used in service workers is more like IndexedDB. While it stores responses just like http cache it differs in that the site is fully in control. There is an explicitly API for storing and retrieving data. The browser guarantees Cache API data will not be deleted unless the site does it themselves or the entire origin is evicted via the quota mechanism. The Cache API is also much more precisely specified in terms of its behavior compared to http cache. The only way to use the Cache API data during loading, though, is via a ServiceWorker that matches a request using Cache API and then returns the Response to the FetchEvent.respondWith().
Note a ServiceWorker can end up interacting with both of these systems. It can explicitly use the Cache API. It can also pull from http cache when it calls fetch().
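As a rough JavaScript sketch of how a ServiceWorker ties the two together (the cache name 'v1' and the URL list are placeholders, not anything from the question):

    // service-worker.js -- illustrative only; 'v1' and the URLs are placeholders
    self.addEventListener('install', (event) => {
      // Explicitly store responses in the Cache API; the browser will not
      // evict these entries except via the origin-wide quota mechanism.
      event.waitUntil(
        caches.open('v1').then((cache) => cache.addAll(['/', '/app.css']))
      );
    });

    self.addEventListener('fetch', (event) => {
      event.respondWith(
        // Answer from the Cache API when possible; otherwise fall back to
        // fetch(), which may itself be satisfied from the browser's http cache.
        caches.match(event.request).then((cached) => cached || fetch(event.request))
      );
    });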

Akamai caching strategy

We started caching our static pages in Akamai with a user-defined TTL (for example, 7 days). We want control over caching, so on the 7th day we purge this cache and recreate it by curling all cached pages.
The issue is that Akamai serves pages from a geographically near node, so there is no control/validation over cache creation. My questions are:
A. How can I ensure purge happens in all nodes
B. How can I ensure while curling urls, cache is updated in all nodes.
C. Is there any better way of controlling cache in akamai?
From what I know, if you've configured a TTL in the Akamai cache, the elements in cache become stale after the defined period, and when a request arrives at that node after the cache has become stale, it will hit the origin (or its parent node, if the server is a child) to refresh the stale content. You don't have to explicitly curl a URL to refresh it. Alternatively, if you want to forcibly refresh the cache, you can use the Akamai APIs or the Edgesuite interface to do so manually.

How can I configure Varnish to cache range requests?

I'm trying to configure Varnish to cache range requests. I noticed the http_range_support option, but everything I've read says that this will attempt to cache the entire file before satisfying the request. Is it possible to serve range requests without requiring that the entire file already be cached?
It depends on the Varnish version.
Since Varnish 3.0.2, you can stream uncached content to clients while Varnish fetches and caches the full object.
https://www.varnish-software.com/blog/http-streaming-varnish
"Basically, his code lifts the limitations of the 3.0 release and allows Varnish to deliver the objects, while they are being fetched, to multiple clients."
The feature is controlled by the beresp.do_stream flag.
https://www.varnish-software.com/blog/streaming-varnish-30
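A minimal VCL sketch, assuming Varnish 3.x (where backend responses are handled in vcl_fetch; in Varnish 4+ the equivalent logic lives in vcl_backend_response):

    sub vcl_fetch {
        # Varnish 3.x: start delivering the object to clients while the
        # backend fetch is still in progress, instead of buffering the
        # whole body before responding.
        set beresp.do_stream = true;
    }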
