I am trying to get some HTML pages cached the same way images are automatically cached by CloudFlare, but I can't get CloudFlare to actually hit its cache for HTML.
According to the documentation (Ref: https://support.cloudflare.com/hc/en-us/articles/202775670-How-Do-I-Tell-CloudFlare-What-to-Cache-), it's possible to cache anything with a Cache-Control set to public with a max-age greater than 0.
I've tried various combinations of headers on my origin Nginx server without success, from a simple Cache-Control: public, max-age=31536000 to more complex sets including s-maxage=31536000, Pragma: public, ETag: "569ff137-6", and Expires: Thu, 31 Dec 2037 23:55:55 GMT.
Any ideas to force CloudFlare to serve the html pages from their cache?
PS: I am getting the CF-Cache-Status: HIT on the images and it works fine but on the html pages nothing, not even CF-Cache-Status: something. With a CloudFlare page rule for html pages, it seems to work fine but I want to avoid using one, mainly because it's too CloudFlare specific. I am not serving cookies or anything dynamic from these pages.
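For anyone debugging the same thing from a script rather than the browser, here is a minimal Python sketch that classifies a response by its CF-Cache-Status header. The function name and the returned labels are my own invention, the headers are assumed to be already fetched into a dict, and only the HIT, MISS, and DYNAMIC values are handled explicitly.

```python
def classify_cf_cache_status(headers):
    """Classify a response by its CF-Cache-Status header.

    `headers` is a plain dict of response headers; header names are
    matched case-insensitively. The returned labels are illustrative.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    status = lowered.get("cf-cache-status")
    if status is None:
        # This is what the HTML pages in the question looked like.
        return "absent"
    status = status.strip().upper()
    if status == "HIT":
        return "served-from-edge-cache"
    if status in ("MISS", "DYNAMIC"):
        return "served-from-origin"
    return "other:" + status


if __name__ == "__main__":
    print(classify_cf_cache_status({"CF-Cache-Status": "HIT"}))
    print(classify_cf_cache_status({"Cache-Control": "public, max-age=31536000"}))
```

Running it against the image response and the HTML response should make the difference between the two visible at a glance.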
It is now possible to get Cloudflare to respect your web server's headers instead of overriding them with the minimum described in the Browser Cache TTL setting.
First, navigate to the Caching tab in the Cloudflare dashboard.
Then scroll down to the "Browser Cache Expiration" setting and select the "Respect Existing Headers" option in the dropdown.
Further reading:
Does CloudFlare honor my Expires and Cache-Control headers for static content?
Caching Anonymous Page Views
How do I cache static HTML?
Note: If this setting isn't chosen, Cloudflare will apply a default 4 hour minimum to Cache-Control headers. Once this setting is chosen, Cloudflare will not touch your Cache-Control headers (even if they're low or not set at all).
I stumbled on this too. The page says:
Pro Tip: Sending cache directives from your origin for resources with extensions we don't cache by default will make absolutely no difference. To specify the caching duration at your origin for the extensions we don't cache by default, you'd have to create a Page Rule to "Cache Everything".
So it appears that you do have to set a page rule to use this for files that CloudFlare doesn't cache by default. This page describes it in more detail:
https://blog.cloudflare.com/edge-cache-expire-ttl-easiest-way-to-override/
That said, it still didn't work for me and appears not to be supported. Their support confirmed this when I contacted them: Respect Origin Header has been removed from all plan types. So if you have no page rules, they will respect the origin header.
This doesn't help with hitting their edge cache for HTML pages, however. To do that you have to set up a page rule. Once that is done you can, I believe, set your max-age as low as your plan allows. Any lower and it gets overwritten. That is to say, with no page rule you could send Cache-Control: max-age=30 and it would pass through. With a page rule that includes edge caching, your max-age becomes subject to the minimum time your plan allows, even if the page rule doesn't specify browser cache.
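To make the behaviour I described concrete, here is a rough Python sketch of it. This only models my understanding from the support reply, not Cloudflare's actual implementation; the function names and the `plan_minimum_seconds` parameter are hypothetical, and the real minimum depends on your plan.

```python
import re


def parse_max_age(cache_control):
    """Return the max-age value in seconds, or None if it is absent."""
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def browser_max_age(cache_control, has_edge_cache_page_rule, plan_minimum_seconds):
    """Model the max-age the browser ends up seeing.

    Without a page rule the origin value passes through untouched.
    With an edge-caching page rule, values below the plan minimum
    are raised to that minimum (assumption based on the answer above).
    """
    origin_value = parse_max_age(cache_control)
    if origin_value is None or not has_edge_cache_page_rule:
        return origin_value
    return max(origin_value, plan_minimum_seconds)


if __name__ == "__main__":
    # 7200 is a made-up plan minimum used only for illustration.
    print(browser_max_age("public, max-age=30", False, 7200))
    print(browser_max_age("public, max-age=30", True, 7200))
```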
The CF documentation is very unclear. Go into "Page Rules", and define a rule that turns on caching, based upon wildcards -- and then it will work.
I am using Google Cloud CDN for my WordPress website https://cdn.datanumen.com. I have enabled the "Force Cache All Content" option. However, the web pages, CSS files, and JavaScript files are still not cached. Only the images are cached.
For example, I tested the page at https://cdn.datanumen.com/. I used Ctrl + F5 to refresh the page many times, but always got the same result.
Below is the web page I try to load:
There is a "Cache-Control" field in the response header, but no "Age" field. According to Google's documentation, when there is a cache hit and cached content is served, there will be an "Age" field. So the absence of "Age" means the file was not served from cache.
I also checked the log:
In the log, cacheFillBytes is 26776 and cacheLookup is true. It seems that Google CDN is trying to look up the cache and fill it with the contents. But statusDetails shows "response_sent_by_backend", so the contents are still served from the backend. Normally this should only occur the first time I visit the website. But in my case, even if I press Ctrl + F5 to refresh the site many times, I always get the same result: statusDetails never shows "response_sent_by_cache" for pages such as https://cdn.datanumen.com/
Why?
Update:
I notice there is a "Vary" field in the response header:
Based on https://cloud.google.com/cdn/docs/caching#non-cacheable_content, if the Vary header has a value other than Accept, Accept-Encoding, or Origin, the content will not be cached. In my case the Vary header is "Accept-Encoding,Cookie,User-Agent", so the content is not cached. But my question is how to deal with this issue and force the content to be cached.
Update 2
I have changed the site to a real WordPress site, since that is what I ultimately need. I plan to use Google Cloud CDN's paid support to see if they can help with this case.
According to the Google Cloud CDN's documentation, the best way to solve your problem is actually using the CACHE_ALL_STATIC cache mode:
CACHE_ALL_STATIC: Automatically caches static content that doesn't have the no-store or private directive. Origin responses that set valid caching directives are also cached. This is the default behavior for Cloud CDN-enabled backends created by using the gcloud command-line tool or the REST API.
USE_ORIGIN_HEADERS: Requires origin responses to set valid cache directives and valid caching headers. Responses without these directives are forwarded from the origin.
FORCE_CACHE_ALL: Unconditionally caches responses, overriding any cache directives set by the origin. This mode is not appropriate if the backend serves private, per-user content, such as dynamic HTML or API responses.
But in the case of the last cache mode, there are two warnings about its usage:
When you set the cache mode to FORCE_CACHE_ALL, the default time to live (TTL) for content caching is 3600 seconds (1 hour), unless you explicitly set a different TTL. Accepting the new default TTL of 1 hour might cause some entries that were previously considered fresh (due to having longer TTLs from origin headers) to now be considered stale.
The FORCE_CACHE_ALL mode overrides cache directives (Cache-Control and Expires) but does not override other origin response headers. In particular, a Vary header is still honored, and may suppress caching even in the presence of FORCE_CACHE_ALL. For more information, see Vary headers.
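Since the Vary header is what suppresses caching here, a small Python sketch can show which values are the problem. The allowed list (Accept, Accept-Encoding, Origin) is taken from the documentation as quoted in the question and may have changed since; the function name is my own.

```python
# Vary values that, per the documentation quoted in the question,
# do not prevent Cloud CDN from caching a response.
ALLOWED_VARY_VALUES = {"accept", "accept-encoding", "origin"}


def vary_values_blocking_cache(vary_header):
    """Return the Vary values that would prevent caching, in original order."""
    values = [value.strip() for value in vary_header.split(",")]
    return [v for v in values if v and v.lower() not in ALLOWED_VARY_VALUES]


if __name__ == "__main__":
    # The header value reported in the question.
    print(vary_values_blocking_cache("Accept-Encoding,Cookie,User-Agent"))
    # -> ['Cookie', 'User-Agent']
```

For the header in the question, Cookie and User-Agent are the values that would have to be removed at the origin (for example by the WordPress plugin or web server config that adds them) before any cache mode can take effect.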
I have tried everything and I am stumped so I will appreciate some help. I have an image on S3 with a cache-control of 'public, max-age=31536000'. The S3 is set up as a static website and I plan to serve it via Cloudflare. However, Cloudflare does not seem to cache the images. Here is an example URL:
https://media-dev.slpht.net/L32JrnoWRgD4xDYBuaEkYzpC
Inspecting it, I see
cache-control: public, max-age=31536000
cf-cache-status: DYNAMIC
I think if it says DYNAMIC, that means that it is not caching. I am also getting 200 OK every time I refresh the page and I am expecting a 304?
Leaving the question and answering it in case it helps someone. If it does not, please remove this post.
It turns out that I had "Disable cache" turned on in my Chrome browser. When I turned it off, refreshing the page returned a 304.
I am seeing an old expiry date in the headers. This is in Firefox, for a Magento 2 site using nginx with FPC on. Please see the header below. Is this something to be worried about?
Looks like it’s intending this asset not to be cached. Notice the cache-control sets max-age to 0 and must-revalidate and some other directives.
Expires is only used by older clients that do not understand the cache-control header (which basically every browser has understood for the last 10 years), but it doesn’t allow a “do not cache” value, so a hack around that is to set an old date to indicate it’s already expired.
So the question for you is: do you want to cache this? If so then yes this is a problem, if not then it is working as intended.
There are massive performance gains from caching when you use the asset again (on a different page, or by coming back to this page), but on the flip side it adds complications if you want to publish a new version of the asset.
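The precedence between the two headers can be sketched in Python as follows. This is a simplified model of how a cache picks a freshness lifetime (max-age wins over Expires, and an unparseable Expires counts as already expired); it ignores s-maxage and heuristic freshness, and the header values in the example are made up, not the ones from the question.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, now):
    """Return the freshness lifetime in seconds, or None if nothing applies.

    `headers` is a dict using the exact keys "Cache-Control" and "Expires";
    `now` must be a timezone-aware datetime.
    """
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age takes precedence; Expires is ignored when it is present.
        return int(match.group(1))
    expires = headers.get("Expires")
    if expires is None:
        return None
    try:
        expires_at = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        # An invalid date is treated as "already expired".
        return 0
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return max(0, int((expires_at - now).total_seconds()))


if __name__ == "__main__":
    now = datetime(2020, 1, 1, tzinfo=timezone.utc)
    example = {
        "Cache-Control": "max-age=0, must-revalidate",
        "Expires": "Thu, 19 Nov 1981 08:52:00 GMT",  # illustrative old date
    }
    print(freshness_lifetime(example, now))
```

Either header on its own gives a lifetime of zero here, which is why the old date is harmless as long as not caching is what you want.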
We have a web application.
Until now we had no real cache handling strategy.
When we had a new version of certain JavaScript files, we instructed our users to clear their browser cache.
Now we want to change this.
Up to this date our starting page was "start_app.html".
In our effort to implement our cache busting strategy we want to ensure that the browser will NOT cache our starting page.
We will change the extension from ".html" into ".php".
It seems that the browser has a list of extensions, such as "php", "asp", and so on, for which it ALWAYS fetches a fresh copy from the web server.
Is this true and which extensions are these?
Thanks a lot in advance.
Please don't rely on incorrect browser behavior to not cache your page. Instead, use this header:
Cache-Control: no-cache, no-store
This page has all the details as to why that header will do what you want.
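As an illustration, here is a minimal Python WSGI sketch that sends this header for the start page regardless of its extension. The path "/start_app.html" comes from the question; the one-year policy for every other path is purely my assumption (it only makes sense if those files get a new URL on each release), and the names are hypothetical.

```python
START_PAGE = "/start_app.html"


def cache_control_for(path):
    """Pick a Cache-Control value by path, not by file extension."""
    if path == START_PAGE:
        return "no-cache, no-store"
    # Assumption: other assets are versioned by URL, so they may be cached long.
    return "public, max-age=31536000"


def app(environ, start_response):
    """Tiny WSGI app that attaches the chosen Cache-Control header."""
    path = environ.get("PATH_INFO", "/")
    body = b"<!doctype html><title>demo</title>"
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", cache_control_for(path)),
    ])
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    # Serves on localhost:8000 until interrupted.
    make_server("127.0.0.1", 8000, app).serve_forever()
```

The point is that the header, not the ".php" or ".html" suffix, decides what the browser does.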
When caching an HTML page with must-revalidate, the browser must check for any update indicated by Last-Modified or ETag. However, the problem is that before max-age is reached, the browser will not make any connection to the website to read the HTTP headers (to analyze Last-Modified and ETag).
How do I force the browser to make a brief connection to read (at least) the HTTP headers before loading the page from cache?
I do not understand the usage of must-revalidate! Isn't it its responsibility to check for updates before max-age? After reaching max-age, the browser will fetch from the website anyway and never use the local cache.
Yes, your understanding of must-revalidate is wrong: it says that the cache may not serve this content once it is stale (i.e. "expired") without first revalidating it with the server. Caches (and browsers) can in theory be set to serve pages even if they are stale, though the standard says they should warn the user if they do this.
To force the browser to recheck your page with the server, the simplest solution is to add max-age=0 to the Cache-Control header. This will let your browser keep a copy of the page in its cache, but compare it with the server's version by sending the contents of Last-Modified or ETag, as you wanted.
It used to be that you could add no-cache instead, but as users have been expecting this to behave as no-store, browsers are gradually treating them the same.
Check the HTTP/1.1 RFC (RFC 2616), section 14.9, for more information on the headers.
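To illustrate the max-age=0 approach, here is a simplified Python sketch of the server side of a revalidation. It assumes the browser sends back a single ETag in If-None-Match and compares it by exact string match; real servers also handle lists of tags, "*", and Last-Modified. The function name and the ETag values are made up.

```python
def revalidate(request_headers, current_etag):
    """Return (status, response_headers) for a conditional GET.

    304 means "your cached copy is still good", so no body is sent;
    200 means the full page must be sent again.
    """
    response_headers = {
        "ETag": current_etag,
        # max-age=0: the copy may be stored but must be rechecked before reuse.
        "Cache-Control": "max-age=0",
    }
    if request_headers.get("If-None-Match") == current_etag:
        return 304, response_headers
    return 200, response_headers


if __name__ == "__main__":
    print(revalidate({}, '"v1"')[0])                         # first visit
    print(revalidate({"If-None-Match": '"v1"'}, '"v1"')[0])  # unchanged page
    print(revalidate({"If-None-Match": '"v1"'}, '"v2"')[0])  # page was updated
```

So the browser does make a short request on every load, but only downloads the page again when the ETag has changed.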