amp-pixel is lost in AMP Cache

Why is an amp-pixel tag like this lost in the AMP cache?
<amp-pixel src="https://amp.abendzeitung-muenchen.de/_CPiX/art-706352-60/706352.gif?RANDOM" layout="nodisplay" class="i-amphtml-element i-amphtml-layout-nodisplay i-amphtml-built" i-amphtml-layout="nodisplay" aria-hidden="true" hidden=""></amp-pixel>
Here is the link to the original page (where you can find the amp-pixel tag):
https://amp.abendzeitung-muenchen.de/muenchen/vorlaeufiger-corona-kassensturz-fast-tausend-euro-pro-kopf-verschuldung-in-muenchen-art-706352
And here is the link to the same page in the AMP Cache (where the amp-pixel tag is gone):
https://amp-abendzeitung--muenchen-de.cdn.ampproject.org/c/s/amp.abendzeitung-muenchen.de/muenchen/vorlaeufiger-corona-kassensturz-fast-tausend-euro-pro-kopf-verschuldung-in-muenchen-art-706352

The cache serves a stale version of your page. If you cache-bust the URL, e.g.: https://amp-abendzeitung--muenchen-de.cdn.ampproject.org/c/s/amp.abendzeitung-muenchen.de/muenchen/vorlaeufiger-corona-kassensturz-fast-tausend-euro-pro-kopf-verschuldung-in-muenchen-art-706352?asdfasdfsa you'll get the amp-pixel.
I'd recommend serving AMP pages with a lower max-age to make sure changes get propagated quickly.
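As a rough illustration of both suggestions, here is a minimal Python sketch. The helper names, the query parameter name cb, and the 60-second value in the usage note are placeholders I chose for the example; nothing here is required by the AMP Cache itself.

```python
import uuid
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_bust(url: str, token: Optional[str] = None, param: str = "cb") -> str:
    """Return url with an extra throwaway query parameter appended.

    The parameter name and the random token are arbitrary (hypothetical);
    the only goal is to produce a URL the cache has not stored yet.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, token or uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))


def cache_control(max_age_seconds: int) -> str:
    """Build a Cache-Control value for the origin's AMP responses."""
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must not be negative")
    return f"public, max-age={max_age_seconds}"
```

For example, cache_control(60) yields "public, max-age=60", which the origin server would send as the Cache-Control header on its AMP pages. Which max-age is appropriate depends on how often the pages change.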

Related

template engine and cache

When using a template engine (pug, thymeleaf, etc.), the server renders an HTML file dynamically and delivers it to the client on each page request.
Suppose there is a company proxy server or a cache server between the server and the client.
Will there ever be a cache hit?
Don't we lose all the benefits of web caching if we send a newly rendered version of our HTML to clients every time?
If the URL is the same for all users, then yes, the CDN cache will be hit most of the time. For responses that must not be cached, you will need to set a Cache-Control header or configure the CDN to bypass its cache when a certain path is requested.
This is why a lot of sites use AJAX calls to fill the pages post-load. All of the HTML can be cached in the CDN and the CDN is configured to bypass cache for all /api paths.
Our site uses a CDN for the public pages (which are still generated with pug). Once you sign in, the CDN is instructed never to cache the "personal" pages that are rendered dynamically.
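The split described above (public pages cacheable, /api paths and signed-in pages never cached) could be sketched as a small header-selection function. This is only an illustration: the function name, the 300-second lifetime, and the exact directive strings are my assumptions, not the configuration of the site mentioned above.

```python
def cache_control_for(path: str, signed_in: bool) -> str:
    """Pick a Cache-Control value for a rendered response.

    Assumptions for this sketch: everything under /api and every page
    rendered for a signed-in user must bypass shared caches; all other
    pages are identical for every visitor.
    """
    is_api = path == "/api" or path.startswith("/api/")
    if signed_in or is_api:
        # Shared caches (CDN, company proxy) must not store this response.
        return "private, no-store"
    # Public, template-rendered page: safe to reuse for a short time.
    return "public, max-age=300"
```

A pug or thymeleaf handler would call this once per request and set the result as the Cache-Control response header, so intermediaries know which rendered pages they may reuse.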

Google AMP Cache - how to force loading index.html from cache?

Is there any way to force the main homepage (index.html) to load from the AMP Cache?
I already load all images from the cache, following the manual: https://developers.google.com/amp/cache/overview
But the DevTools audit still reports an error for the homepage (it is not served over HTTP/2, i.e. not from the cache).
I’m not sure exactly what you mean, but I think you may be misunderstanding the point of the AMP cache.
The Google AMP Cache is not like a CDN (Content Delivery Network) that always sits in front of your site, though in certain instances it acts like one.
The Google AMP Cache is automatically populated by Google when it crawls your site. Any searches on Google while on mobile will then serve your AMP pages, rather than your normal pages, and will also serve them from the Google AMP cache rather than from your domain. This is done for a number of reasons, but primarily to create the “instant loading” effect that AMP gives when loaded from Google Search results (aka Search Engine Results Page or SERP). In this case the whole page including the index page is served from the Google AMP Cache.
Other sites and domains can also decide to display AMP pages instead of your HTML pages if they want, and can decide to serve them from the Google AMP cache, from their own AMP cache (though, other than Google, only Cloudflare have implemented their own AMP Cache AFAIK) or directly from your domain (in which case there is no cache used). Twitter, for example, automatically replaces links with their AMP equivalents but loads them from the real domain, so the page is fast (due to AMP) but not “instant” (like it is in the Google Search results).
So you, as a site owner, don’t decide when to use the AMP Cache - the calling application (e.g. Google SERP, Twitter) decides that. And if the calling app/page doesn’t use an AMP Cache, then the page is served directly from your domain and therefore over whatever protocol your domain supports (e.g. HTTP/1.1 or HTTP/2). You can of course give out the AMP Cache URL instead of your real one if you want.
You seem to suggest you have altered your page to replace all images and the like with references to the AMP cache - is that so? If so that sounds like a bad idea, as the cache is loaded from your site which now depends on the cache, which is loaded from your site, which is... etc.
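If you do want to hand out the AMP Cache URL for a page, it can be derived from the origin URL. The following Python sketch covers only the simple case of a plain ASCII hostname (dashes doubled, dots turned into dashes); the full Google AMP Cache rules also handle internationalized domain names and a hashed fallback for hostnames that do not fit, which this sketch deliberately leaves out. The function name is my own.

```python
from urllib.parse import urlsplit


def amp_cache_url(page_url: str) -> str:
    """Derive the Google AMP Cache URL for an AMP document (simplified)."""
    parts = urlsplit(page_url)
    host = parts.hostname or ""
    if not host or not host.isascii():
        raise ValueError("this sketch only handles plain ASCII hostnames")
    # Escape existing dashes first, then turn dots into single dashes.
    subdomain = host.replace("-", "--").replace(".", "-")
    # "c" marks a document; the extra "s" segment marks an HTTPS origin.
    secure = "s/" if parts.scheme == "https" else ""
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"https://{subdomain}.cdn.ampproject.org/c/{secure}{host}{path}{query}"
```

As a sanity check, the host amp.abendzeitung-muenchen.de from the amp-pixel question at the top of this page maps to the subdomain amp-abendzeitung--muenchen-de, which matches the cache URL quoted there.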

Serve dedicated HTML page to Google crawler without changing the URL to make for dynamic content

My website is built in JavaScript, with dynamically generated content on top of a fixed HTML frame. To make Google aware of the content, I use the _escaped_fragment_ trick and detect on the server side when to serve fixed content instead of dynamic content. It all works well for the sub-pages as long as they are linked with #!, which is the case for all pages but the homepage.
I obviously want to keep the homepage without an ugly #! at the end of the URL.
So far the only solution I can think of is to serve the homepage with fixed content instead of Ajax-generated content for everyone.
I would rather keep the Google-dedicated version separate from the common version, as I don't maintain it as much, especially in terms of CSS and navigation, which do not matter that much for the crawler.
Is there a way to figure out that it is Google crawling the website and serve a static version instead?
The solution is to add the meta tag:
<meta name="fragment" content="!">
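With this tag in the page's head, the crawler requests the same URL with ?_escaped_fragment_= appended, so the server can return the static version for the homepage without a #! in its URL.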
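On the server side, the check for a crawler request then comes down to looking for the _escaped_fragment_ query parameter, which is present (possibly with an empty value) only when the crawler asks for the static version. A minimal Python sketch, with a function name of my own choosing:

```python
from urllib.parse import parse_qs


def wants_static_snapshot(query_string: str) -> bool:
    """Return True when the request carries the _escaped_fragment_ parameter.

    keep_blank_values=True matters here: for the homepage the parameter
    arrives with an empty value ("?_escaped_fragment_=").
    """
    params = parse_qs(query_string, keep_blank_values=True)
    return "_escaped_fragment_" in params
```

The homepage handler would call this with the raw query string and serve the fixed HTML when it returns True, and the regular Ajax-driven page otherwise.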

cache manifest uncache

Now that I've successfully cached my web page, how do I uncache it after making a change?
My users can't download the latest version, even after I've changed a comment in my cache.manifest file.
My server is an IIS server.
The thing with caching is, well, stuff gets cached. Browsers won't, in general, try to download anything you've told them to cache until the cached items expire.
If you set everything to cache for a certain time span, the browser won't try to download any of the cached items until the end of it, which includes the cache.manifest file itself, by the sound of it.
Typically, you don't want to cache the content of the website, because then that makes it hard to change. Instead, you want to cache the various pieces, like images, css, and javascript, that the various pages of your site need. If you do this right, you can get a huge benefit for your users, and still have control over those resources, since you can always link to a different version of a particular resource in the content of the pages.
That said, if you do need to cache some portions of your pages, you can use server-side caching to reuse portions that are expensive to put together.
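The idea of linking to a different version of a resource is often implemented by putting a content hash into the file name, so that a changed file gets a new URL and the long-cached old one is simply never requested again. A minimal Python sketch of that naming step; the function name and the 8-character digest length are arbitrary choices for the example:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension.

    "css/site.css" becomes "css/site.<hash>.css"; the hash changes
    whenever the file content changes.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

The pages themselves (which are not cached for long) then reference the versioned names, while the resources behind those names can be cached aggressively.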

Understanding how images are served and cached

I'm wondering how browsers treat requests for images. I'm hoping to use a CDN for serving product images on my website. I'd also like to use the CDN for serving button images and images used in my CSS.
The problem with this is that I don't have control over the Expires headers (Rackspace Files is what I'm looking into).
Say I have a large image file as a background on my home page. The page is accessed often, but the image stays the same. Is the browser going to request this image every time?
Or should I just use a CDN for my product images?
Caching is quite a broad subject. I suggest you start by reading about the different kinds of caching here http://www.mnot.net/cache_docs/#BROWSER and how caching works here http://www.web-caching.com/mnot_tutorial/how.html
Now, to answer your question: assuming the user has caching enabled and the CDN response headers are properly configured, a user visiting your page multiple times will request that background image only once, until the cached copy expires or is cleared.
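To make "until the cache expires" concrete, here is a simplified Python sketch of the decision a browser cache makes for a stored image. It looks only at the Cache-Control header; real browsers also consider Expires, validators such as ETag, and heuristic freshness, all of which are left out here. The function names are my own.

```python
from typing import Dict


def parse_cache_control(value: str) -> Dict[str, str]:
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"')
    return directives


def can_reuse_without_request(cache_control: str, age_seconds: int) -> bool:
    """True if a stored response may be used without contacting the server."""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age", "")
    # Without an explicit max-age this sketch conservatively says "no".
    return max_age.isdigit() and age_seconds < int(max_age)
```

With a header such as "public, max-age=86400", an image fetched an hour ago is reused without a new request; after a day has passed, the browser has to go back to the CDN.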
No. AFAIK, you need to add a caching header to your image responses to enable browser caching. This is a great tutorial about it.
Additionally, you can read this article from Yahoo for a brief overview of the topic.
In particular, review these sections of the article:
Minimize HTTP Requests
Add an Expires or a Cache-Control Header
Use a Content Delivery Network
Hope this helps.

Resources