How to disable AMP caching from Google Search? [closed] - caching

Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 5 months ago.
Improve this question
Some results on Google Search comes with AMP (Accelerated Mobile Pages) icon on theirs links, at least when using a mobile, as soon you click on the link instead of loading the site, google show you a cached version of it rather.
I want to disable this behaviour on my results, I see at least two good reasons for it:
When sharing the link it is a pain in the neck to have the huge google URL in place of the shorter one would be just with the original one.
Security: when you access any site and see a URL other than the site you wanted to load, you should distrust it, even if it looks like google (remember, you can get phished or even get caught in a trap hosted on gsites), Google should respect that instead of encouraging users to trust it just because the url looks like google! Even worst if combined with the first reason and you want to share the URL with a friend.
I have to remove the google AMP prefix ever and ever, there is no advanced search option or cookie that makes Google give the clean URL?

According to the AMP project FAQ you cannot:
By using the AMP format, content producers are making the content in AMP files available to be cached by third parties.
As a content producer I dislike Google adding their own URL, and branding around my content... From the consumer perspective looks like the content comes from Google. They say it is to improve speed, but you can see Google's intention behind this "free" technology.

A simple hack is to keep using AMP guidelines for the speed it provides to the page, but violate one rule (like add you own javascript that does noting).
Once pages have an error, google will not cache them.

By publishing AMP pages you let Google or any other AMP cache store and deliver your web page (which surprisingly seems to be legal):
Caching is a core part of the AMP ecosystem. Publishing a valid AMP document automatically opts it into cache delivery. (https://www.ampproject.org/docs/fundamentals/how_cached)
To stop AMP from caching, the project recommends to invalidate the format by removing the amp attribute from the <html> tag. I propose something else.
One thing I always disliked about AMP ist that it requires you to embed the JavaScript code directly from their server (https://cdn.ampproject.org/v0.js), effectively telling AMP about every single visitor to every AMP page. Embedding the code from your own server stops this privacy issue, disables caching, and still gives you the framework.
To do so you can build your own AMP framework using the source code:
https://github.com/ampproject/amphtml
But it's much simpler to just copy v0.js and all the scripts it fetches to your own server.

Odd because google says to remove the "amp" from the tag to not cache.
It said nothing about loading the js locally.
https://amp.dev/documentation/guides-and-tutorials/learn/amp-caches-and-cors/how_amp_pages_are_cached/
Is google wrong?

Related

AJAX Crawling with question mark instead of hashbang

Where I'm at: I've read Google's documentation regarding it's AJAX crawling, and I've searched around a bit in this website and others, but I'm quite confused, as it seems that all problems address the same issue: AJAX crawing with hashbangs?
I've developed an app which, among other purposes, let's the user search for locations worldwide, using an AJAX searcher quite similar to Google's, but my app uses exclusively the question mark in AJAX, instead of hashbang. Due to compatibility issues, changing it to the hashbang is not an option.
Not only am I largely confused by the fact that I could not find anyone else using the question mark instead of the hashbang, I'm also wondering if there is any documentation regarding my issue: how to let google bot crawl all my AJAX content when I'm using the question mark instead of a hashbang in my AJAX app.
The AJAX crawling schema was created explicitly for applications and websites using hashbang (#!) in the URL structure, because the fragment part of the URLs only exist on the client side; the URL rewriting in the specs, i.e. from #! to ?_escaped_fragment_= is meant to solve that.
Since most of the web is already making use of Javascript in a way or other, we (Google) needed a better solution, so we started executing Javascript in the pages we crawled and effectively render every page, just like a normal browser would. To quote our blogpost, Understanding web pages better:
In order to solve this problem, we decided to try to understand pages by executing JavaScript. It’s hard to do that at the scale of the current web, but we decided that it’s worth it. We have been gradually improving how we do this for some time. In the past few months, our indexing system has been rendering a substantial number of web pages more like an average user’s browser with JavaScript turned on.
You can also see what we "see" using Fetch as Google in Search Console (former Webmaster Tools); read more about the feature in our post titled Rendering pages with Fetch as Google
Before you do anything else, please try to fetch a few pages from your site with Fetch as Google. You might not have to do anything at all, it might actually work out of the box. And the good news is that it's not only Google that's rendering pages!

General web page loading speed and performance best practices [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
What are some general (not specific to LAMP, .NET, Ruby, mySql, etc) tactics and best practices to improve page loading speed?
I am looking for tips about caching, HTTP headers, external file minification (CSS, JS), etc.
And for good tools like Google PageSpeed and Yahoo YSlow.
A "ultimate resource" wiki style checklist of "things not to forget" (moderated and updated by all the wizards here on SO) is the end goal. So folks don't have to Google around endlessly for outdated blog posts on the subject. ;)
I hope the "subjective" mods go easy on me, I know this is a bit open ended. And similar questions have been asked here before. And this material overlaps with the domain of ServerFault and Webmasters too a bit. But there is no central "wiki" question that really covers this so I am hoping to start one. There are great questions like this that I refer to on SO all the time! Thanks
Caching of page content
Load javascript at the bottom of the page
Minify css (and javascript)
Css and javascript should be in their own [external] files
If possible combine all js or css files into one of each type (saves server requests)
Use Google's jQuery and jQuery UI loaders (as it's likely already cached on some computers)
Gzip compression
Images should be the same size as the width and height as in the markup (avoid resizing)
Use image sprites when appropriate (but don't over do it)
Proper use of HTML elements ie. using <H#> tags for headers
Avoid div-itis or the more popular now ul-itis)
Focus javascript selectors as much as possible ie. $('h1.title') is much quicker than $('.title')
Make your dynamic content more static.
If you can render your public pages as static contents you'll help proxy, caches, reverse proxy, things like web application accelerators & DDOS preventing infrastructures.
This can be done in several ways. By handling the cache headers of course, but you can even think about real static pages with ajax queries to feed dynamic content, and certainly a mix between these two solutions, using the cache headers to make your main pages static for hours for most browsers and reverse proxys.
The static with ajax solution as a major drawback, SEO, bots will not see your dynamic content. You need a way to feed bots with this dynamic data (and a way to handle user accessing this data from search engines url, big hell). So the anti pattern is to have the real important SEO data in a static page, not in ajax dynamic content, and to limit ajax fancy user interactions to the user experience. But the user experience on a composite page can maybe be more dynamic than the search egine bots experience. I mean replace the latest new every hours for bots, every minute for users.
You need as well to prevent premature usage of session cookies. Most proxy cache will avoid caching any HTTP request containing a cookie (and this is the official specification of HTTP). The problem with this is often application having the login form on all pages, and which need an existing session on the POST of the login form. This can be fixed by separate login pages, or advanced redirects on the login POST. cookie handling in reverse proxy cache can as well be handled in modern proxy cache like Varnish with some configuration settings.
edit: One advanced usage of reverse proxy page can be really useful: ESI, for example with varnish-esi. You can put on your html render tags that the ESI reverse proxy will identify. ach of these identified regions can have different TTL -Time To Live- (let's say 1 day for the whole page, 10 min for a latest new block, 0 for the chat block). And the reverse proxy will make the requests in is own cache or to your backend to fill these blocks.
Since the web exists handling proxys and caches has always been the main technique to fool the user, thinking the web was fast.

Pros and Cons of an all Ajax Site? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 10 months ago.
Improve this question
So I actually saw a full ajax site somewhere (I forget where) and thought it would be something new and fun to try. I used an old site I had built and put it on a new server. With a little bit of jquery and ajax, I was able to make the entire site work on one page load.
My question is, what are some pros and (more likely) cons to this method?
Please note - the site works through a semi clever linking function. Everything works perfectly fine if the user doesn't have javascript enabled, the newly requested page loads like it would on any other website.
More detail -- Say the user loads the homepage of the site, then logs in. When they log in, the login box fades and reappears with user info. Other content on the page loads as necessary upon logging in. If they click a link, lets say "Articles", one column on the homepage slides up and slides back down with the articles. If they click home the articles slide up and the homepage content slides back down. Things like posting comments, viewing profiles, voting on things, etc. are all done through ajax.
Is this a bad method of web design? If so, why?
I am open to all answers/opinions.
IMO, this isn't "bad" or "good". That depends completely on whether or not the website fulfills the requirements. Oftentimes, developers working on AJAX-only sites tend to miss the whole negative SEO impact issue. However, if the site is developed to support progressive enhancement (or graceful degradation depending on your point of view), which it sounds like you have, then you're good. Only things to prepare for are times when the AJAX call can't complete as expected (make sure you're dealing with timeouts, broken links, etc) so the user doesn't get stuck staring at a loading icon. (The kind of stuff you'd have to deal with in any application, really.)
There are plenty of single-page websites out there using heavy JS and AJAX for the UI and they are great. Specifically, I know of portfolio sites for web designers and web app development teams that use this approach. Oftentimes, the app feels a bit like a flash app, but without the need for a special plugin.
"Is this a bad method of web design? If so, why?"
Certainly not. In fact, making web-pages behave more like desktop applications, whilst remaining functional to ALL users, is the holy-grail of web-design.
I say, as long as you consider ALL your users, i.e. mobile/text-only/low bandwidth/small screensizes then you will be fine. Too many developers just do it for their huge 19" screens and 10Mbps, that users to get left behind through almost no fault of their own.
It depends on the user
This relates closely to UX, IMHO, though of course it's on-topic for programming solutions.
All-AJAX is often called "managing state" 12 years after this Question was asked.
From my experience in:
Creating a platform for API plugins
Creating two of my own CMS web apps for different purposes
Managing many different WordPress.org sites for different purposes
Managing my own cloud servers for both PHP-AJAX and Node.js doing these calls
...it depends on what is most efficient for users.
Consider these scenarios:
Will users be clicking around this website all day long or for at least an hour adjusting many different options and <form> inputs?
Or will many users visit briefly to perform just a handful of quick tasks?
State-managed / all-AJAX is by far best for scenario 1, with Facebook and Gmail as prime examples.
Whole-page loads are more efficient for scenario 2, like blogs, especially with pages linked directly from search results. That might apply to webstores like Amazon, maybe, where users search Google to find one or two products, then leave.
Philosophically, I've heard that the difference is about the number of users and traffic, but I don't quite agree. It's more about how much clicking and <form> sending the primary target user will be doing.

The role and scope of Ajax in modern websites. Finding the right balance [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 12 years ago.
My friend and I are building a website together, and he is insistent that page refreshes are a thing of the past and that we should build the whole website in AJAX. His only reason why page refreshes are 'annoying' is that they are too slow.
However, the page is running fine without AJAX currently and when I click from page to page, it seems instantaneous to me. It doesn't seem that it would benefit from additional speed, but he just says I'm being stubborn.
I do want to use AJAX for certain features and pages within the site. I feel like I understand the pros and cons. He references that gmail is made in AJAX, but the url changes as I go into different mailboxes, so I don't think it is 100% AJAX.
I reference wikipedia, which is actually much more similar to our site, as an example of a highly succesful website which doesn't seem to NEED AJAX. But he says that's just one example, and that I am fixated on on wikipedia.
Some personal rant:
1. When I tell him that AJAX is great, but that most of the internet will still require page refreshes and page links, he thinks I'm crazy.
2. When I tell him that using AJAX when not needed will make the back-button useless, he tells me that I'm obsessed with the back button.
3. I think AJAX is something that can be added later to make functionality smoother on certain features, but that it is OK to build the core of the website without it for now.
What is your opinion on the matter? When is ajax really needed in a website?
Thanks
No, Ajax is not necessary for a website to be great. But it can improve usability if used correctly and not overused.
A site built entirely with Ajax and non-functional with JavaScript disabled is a nightmare. No navigation back/forward. No bookmarking. Not to mention its effects for SEO, that is the site will be invisible to search engines.
The golden rule: build the site in classic fashion then add little elements of Ajax to improve usability now and then.
For certain advanced functions it might be okay to be only available as Ajax, but try to make sure the most of the site is at least accessible in read mode when JavaScript is disabled. StackOverflow is a great example of that approach.
My rule of thumb is - what are you building: a website or a web application?
if its a website, content should just NOT be loaded via ajax. It breaks many assumptions that the end users have about the website. Other problems:
1. SEO
2. back button breaks
3. more work to do on your side to make the website UI consistent
4. placing relevant ads will be more complicated
An excellent example is wikipedia.
If its a webapp, then ajax can really help:
1. you can design better user interaction
2. the user will actually expect the app to behave like a rich app, and not like a website.
3. you can dramatically increase the responsiveness of the app using ajax.
I hope that helps.
Of course, AJAX is not necessary to build a great website. It can, however, improve the user experience in certain situations. It is necessary to carefully study and understand your requirements, the structure of your site, and the navigation your users will undertake.
One important thing to consider, however, is bookmarks. Using AJAX extensively makes it extremely difficult to be able to simply bookmark a certain spot or "state" of your website.
Sorry to not post on topic, but I agree with the answers other posted (AJAX in great if not used too much.) It also DEPENDS on the website, if it's more like a web app where you don't need SEO and bookmarks (like gmail) you can go with full ajax (try GWT), if it's content rich- go with just a little AJAX.
But what I wanted to underline is the relation with your friend: you have to be careful when you start a big website with somebody else. If your fight is too big for such a detail you'll have much more problems later.
Get a website that supports a lot of connections, see how they do things and you might understand where and when ajax is used. Start looking at StackOverflow for instance.
This entire site is serving 16 million
pages a month and we are doing it off
of 2 servers, which are almost
completely unloaded. The Microsoft
stack is a pretty good stack.
Joel Spolsky, StackOverflow.com
Form validation using Ajax is the way to go.
I hate clicking "Submit" only to have the page return in a few seconds saying my password is not strong enough or that user ID is already in use. It should be instantaneous while I'm filling in the fields!
As far as StackOverflow, I think it's great how when I click on "Show additional comments", I see the spinner and then they immediately appear. When I change the sort on the answers by "newest" say, I hate how the page refreshes.
I don't think you need Ajax for a site to be great. That said, more sites that are great make use of Ajax. Good RIAs are awesome.
I dont see a lot of ajax on Digg, ArsTechnica, LifeHacker, and the such. Those are all (subjectively) pretty good sites.
No, you don't need it. It just needs to perform well for what your intended audience needs.
Yeah I think you do need it, having to submit pages is so pre-millenium.
More seriously, if you are presenting data, I really think it improves the user experience if asynchronous calls to server are used and the returned data displayed without the need for a complete page refresh.
I remember the first time I saw it used (years back) I was extremely impressed, awed even.
Anyone got example of dynamic, data driven websites that look great and don't use ajax ?
AJAX is not an absolute NEED for a website application. It does not necessary mean that your page will be faster. A lot more things determine page speed, such as:
client side minifying (css and js)
image compression and sprites
server location
and much, much more
Of course, applying AJAX in some strategic point of your website will be where you will most benefit from it. Use it where there is likely to be a lot of activity from your users. I personally always make my website without any http requests handling at first, and then implement the rest by adding AJAX where there is the much concern.
I think your friend is being a bit too concerned about AJAX. Like everything in life, it always tastes better with moderation.
One potential downside of AJAX, when misused, is that content can't be bookmarked.
Try and follow the rule of thumb, that the user should be able to link to content by copying the URL from the address bar. There are several ways to achieve this, with traditional page loads being one.
AJAX is not a must for any website. But if your website has voting, save as favorite, or add to cart, etc ajax will definitely add value.

Services to suggest changes in markup to improve search-engine friendliness (think W3Cs validation services)

Are there any services out there, that can parse a website and give some sort of feedback to how search-engine friendly that website is? And perhaps even suggest changes to the mark-up to improve indexing?
Think W3Cs validation services.
Try Google Webmaster Tools. After you add your site, it will often list "problems" with your site, such as duplicate title tags and meta descriptions, and also things like 404 errors.
If you are a GoDaddy customer you can validate web-crawler friendliness on your hosted sites.
Tools to automate alteration of markup code for any objective are a horribly frightful proposition. Simply write your code correctly the first time. If you have archaic code then it likely has many other problems in addition to SEO, and automatically imposing global changes can expose problems you may not be prepared to address.

Resources