There are several advantages to HTML5 Pushstate in comparison to hashbangs; in fact, Google is now encouraging the use of Pushstate. The only Pushstate disadvantage being publicly discussed is that older browsers do not support it. However, it seems to me that Pushstate is also at a disadvantage when it comes to caching. I might be wrong, hence this question.
Is Pushstate inferior to hashbangs when it comes to caching pages?
Here is a case where it seems that Pushstate is bad at caching.
Pushstate
Bob navigates to eg.com/page1, the full page is downloaded, rendered and cached.
Bob clicks a button, eg.com/json/page2 is downloaded and cached.
The browser processes the JSON and re-renders parts of Bob's page.
Pushstate changes the displayed browser address to eg.com/page2.
Bob closes the browser, then re-opens it and directly visits eg.com/page2. The full page is downloaded, rendered and cached.*
* Despite the fact that the content is already theoretically available in the cache under the guise of eg.com/json/page2.
Hashbangs
Alice navigates to eg.com/#!page1, eg.com/index.html is downloaded and cached.
eg.com/json/page1 is downloaded and cached.
The browser processes the JSON and renders Alice's page.
Alice clicks a button, eg.com/json/page2 is downloaded and cached, and the displayed browser address is changed to eg.com/#!page2.
The browser processes the JSON and renders Alice's page.
Alice closes the browser, then re-opens it and directly visits eg.com/#!page2. NOTHING is downloaded and everything is loaded from cache, unlike Pushstate.
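The two scenarios above can be compared with a small sketch that lists which URLs the browser has to request on a direct (cold) visit. The function name and the /json/<page> convention are assumptions taken from the example URLs in this question, not a real API. The point it illustrates is that the fragment after # is never sent to the server, so the hashbang visit asks for the same two URLs as before, while the Pushstate visit asks for a URL the cache has not seen yet.

```javascript
// Hypothetical helper: page data lives at /json/<page>, as in the scenarios above.
function jsonUrlFor(pageName) {
  return "/json/" + encodeURIComponent(pageName);
}

// Lists the paths requested when the user opens `address` directly.
// `address` must be an absolute URL.
function requestsOnDirectVisit(address) {
  const url = new URL(address);
  if (url.hash.startsWith("#!")) {
    // The fragment stays in the browser: the server only sees the shell page,
    // and the script then asks for the JSON belonging to the fragment.
    return [url.pathname, jsonUrlFor(url.hash.slice(2))];
  }
  // Pushstate address: the full page is requested under its own path.
  return [url.pathname];
}

console.log(requestsOnDirectVisit("http://eg.com/#!page2"));
console.log(requestsOnDirectVisit("http://eg.com/page2"));
```

Whether the hashbang requests are actually answered from the cache still depends on the caching headers the server sends for the shell page and the JSON.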
Summary
I have numerous similar cases in mind. The question is whether this reasoning is valid; I may be missing something that is leading me to the wrong conclusion. Is Pushstate inferior to hashbangs when it comes to caching pages?
I think that pushstate is inferior, but if you are building an SPA correctly the differences should not be significant:
Assuming that you are using one of the latest frameworks, your index.html page should be relatively small, with a few <script> tags (produced by tools like webpack, SystemJS, etc.).
The JS files referenced by these tags are cached normally, so the only difference between the two methods is that index.html is fetched for every pushstate URL, as opposed to being fetched once in hashbang mode.
I got the idea from the following question:
https://webmasters.stackexchange.com/questions/65694/is-this-way-of-using-pushstate-seo-friendly
I was reading up on AJAX and how it lets us exchange data with a server behind the scenes and consequently avoid full page reloads. My confusion lies here: I don't really understand what a full page reload means. I think it's probably because I've been working with AJAX/React since the start and have not really seen any webpage of mine fully reload when I access data from a database or an API.
It'd be great if someone could explain what they are and why we needed them before AJAX.
A full page load is where the entire page is downloaded from the server. A page typically consists of several sections: header, footer, navigation, and content. In a classic web application without AJAX, a user clicks on a link to another page, and has to download the full page, even though only the main content is changing. The header, footer, and navigation all get downloaded again even though they don't change.
With AJAX there is the opportunity to only change the parts of the page that will change. When a user clicks on the link, JavaScript loads just the content for that link and inserts it into the current page. The header, footer, and navigation don't need to reload.
This introduces other problems that need attention.
When AJAX inserts new content into the page, the URL doesn't change. That makes it difficult for users to bookmark or link to specific content. Well written AJAX applications use history.pushState() to update the URL when loading content via AJAX.
There are then two paths to get to every piece of content. Users can either load the URL containing that content directly, or load the content into some other page by following a link. Web developers need to test and ensure both work.
Search engines have trouble crawling AJAX powered sites. For best compatibility, you need to employ server side rendering (SSR) or pre-rendering to serve initial content on a page load that doesn't require JavaScript.
Even for Googlebot (which executes JavaScript) care must be taken to make an AJAX powered site crawlable. Googlebot doesn't simulate user actions like clicking, scrolling, hovering, or moving the mouse.
Content needs to appear on page load without any user interaction.
You must use <a href=...> links for navigation so that Googlebot can find other pages by scanning the document object model (DOM). For users, JavaScript can intercept clicks on those links and prevent a full page load by returning false from the onclick handler or calling event.preventDefault() in the click handler.
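The points above (real <a href> links, intercepted clicks, history.pushState() for the URL) can be sketched roughly as follows. The names shouldIntercept, installAjaxNavigation and loadContent are hypothetical, and the document and window objects are passed in as parameters only so the logic is easy to try outside a browser. It is a minimal sketch, not a complete router.

```javascript
// Decide whether a click on a link should be handled by AJAX.
// Modified clicks (new tab, new window), other mouse buttons, links with a
// target, and links to other origins are left to the browser.
function shouldIntercept(event, link, origin) {
  if (event.defaultPrevented || event.button !== 0) return false;
  if (event.metaKey || event.ctrlKey || event.shiftKey || event.altKey) return false;
  if (link.target && link.target !== "_self") return false;
  return new URL(link.href, origin).origin === origin;
}

// Wire up navigation. `loadContent(url)` is an assumed callback that fetches
// the content for `url` and inserts it into the page.
function installAjaxNavigation(doc, win, loadContent) {
  doc.addEventListener("click", (event) => {
    const link = event.target.closest("a[href]");
    if (!link || !shouldIntercept(event, link, win.location.origin)) return;
    event.preventDefault();                   // stop the full page load
    win.history.pushState({}, "", link.href); // keep the URL bookmarkable
    loadContent(link.href);
  });
  // Back/forward buttons: reload the content that matches the restored URL.
  win.addEventListener("popstate", () => loadContent(win.location.href));
}
```

In a browser this would be called once as installAjaxNavigation(document, window, loadContent). Because the links keep their real href values, a crawler or a user without JavaScript still gets a normal full page load.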
I saw this question asked here 18 months ago, but without (a correct) answer: Window like facebook chat
Both Facebook and OkCupid have messaging windows which stay open even when you click through to another page on their website. The IM window (and friend list, on Facebook) doesn't so much as flash or "blink" as if it were reloading quickly. If you refresh the website (F5 or such) then the messages will disappear, at least for a moment.
The only thing I can think of is that the entire website never actually changes addresses, but just pushes the new URLs to your browser so it looks like the URL changed, but you never really left the same file.
How are they offering this persistent chat?
My guess is they are using something similar to jquery-pjax:
https://github.com/defunkt/jquery-pjax
From their docs:
pjax works by grabbing html from your server via ajax and replacing the content of a container on your page with the ajax'd html. It then updates the browser's current url using pushState without reloading your page's layout or any resources (js, css), giving the appearance of a fast, full page load. But really it's just ajax and pushState.
This means clicking a link on the page will load only part of the page and leave the chat windows untouched (no flicker). If you hit F5, the browser itself initiates the refresh, which does not go through ajax/pushState. This causes the chat windows to flicker.
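The server side of this idea can be sketched as below. It assumes, as I understand the jquery-pjax README, that pjax requests carry an X-PJAX request header; the function name, the element ids and the markup are made up for illustration. A pjax request gets only the container content, while a normal request (including F5) gets the whole layout with the chat window in it.

```javascript
// Hypothetical server-side helper; `requestHeaders` is a Node-style header
// object (lower-case keys) and `contentHtml` is the page-specific markup.
function renderPage(requestHeaders, contentHtml) {
  // Assumption: the pjax client marks its requests with "X-PJAX: true".
  const isPjax = requestHeaders["x-pjax"] === "true";
  if (isPjax) {
    // Only the fragment that replaces the container; the layout, and with it
    // the chat window, is never touched in the browser.
    return contentHtml;
  }
  // Full page load: the layout and chat window are rebuilt from scratch,
  // which is the flicker seen after pressing F5.
  return [
    "<html><body>",
    '<div id="chat">...</div>',
    '<div id="pjax-container">' + contentHtml + "</div>",
    "</body></html>",
  ].join("\n");
}
```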
I have tried to set my site up (http://www.diablo3values.com) according to the guidelines set out here: https://developers.google.com/webmasters/ajax-crawling/ However, it appears that Google has updated its index (because I see the revisions to the meta description tags), but the AJAX content does not show up in the index.
I am trying to use the “Handle pages without hash fragments” option.
If you view either of the following:
http://www.diablo3values.com/?_escaped_fragment_=
http://www.diablo3values.com/about?_escaped_fragment_=
you will correctly see the HTML snapshot with my content. (Those are the two pages I am most concerned about.)
Any ideas? Am I doing something wrong? How do you get Google to correctly recognize the tag?
I'm typing this as an answer, since it got a little too long to be a comment.
First of all, your links seem to point to localhost:8080/about, and not /about, which is probably why Google doesn't index it in the first place.
Second, here's my experience with pushstate urls and Google AJAX crawling:
My experience is that AJAX crawling with pushstate URLs is handled a little differently by Google than with hashbang URLs. Since Google can't know that your URL is a pushstate URL (it looks just like a regular URL), you need to add <meta name="fragment" content="!"> to all your pages, not only the "root" page. Google also doesn't seem to know that the pages are part of the same application, so it treats every page as a separate AJAX application. So Googlebot will never actually create a navigation structure inside _escaped_fragment_, like _escaped_fragment_=/about, as it would with a hashbang URL (#!/about). Instead, it will request /about?_escaped_fragment_= (which you apparently already have set up). This goes for all your "deep links": instead of /?_escaped_fragment_=/thelink, Google will always request /thelink?_escaped_fragment_=.
But as said initially, the reason it doesn't work for you is probably because you have localhost:8080 urls in your _escaped_fragment_ generated html.
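The mapping described in this answer can be written down as a small function. This is only a sketch of my reading of the AJAX crawling scheme: the function name is hypothetical, the escaping is simplified to a handful of characters, and the pushstate branch assumes the page carries the <meta name="fragment" content="!"> tag.

```javascript
// Maps the URL a user sees to the URL the crawler would request.
function crawlerUrl(prettyUrl) {
  const hashbangIndex = prettyUrl.indexOf("#!");
  // Everything before "#!" (or before any plain "#") is sent to the server.
  const base = hashbangIndex === -1
    ? prettyUrl.split("#")[0]
    : prettyUrl.slice(0, hashbangIndex);
  // Pushstate URLs have no hashbang, so their fragment value is empty.
  const fragment = hashbangIndex === -1 ? "" : prettyUrl.slice(hashbangIndex + 2);
  // Simplified escaping: only %, #, &, + and space are percent-encoded here.
  const escaped = fragment.replace(/[%#&+ ]/g, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase());
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

console.log(crawlerUrl("http://www.diablo3values.com/#!/about"));
console.log(crawlerUrl("http://www.diablo3values.com/about"));
```

The first call shows the hashbang form, where the navigation ends up inside the parameter value; the second shows the pushstate form, where the path stays as it is and the parameter is empty.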
Googlebot only knows to crawl the escaped fragment if your URLs conform to the hashbang standard. As users navigate your site, your URLs need to be:
http://www.diablo3values.com/
http://www.diablo3values.com/#!contact
http://www.diablo3values.com/#!about
Googlebot actually needs to see these URLs in the source code so that it can follow them. Then it knows to download the following URLs:
http://www.diablo3values.com/?_escaped_fragment_=contact
http://www.diablo3values.com/?_escaped_fragment_=about
On your site you appear to be loading a new page on each click, and then loading the content of each page via AJAX too. This is not how I would expect an AJAX site to work. Usually the purpose of using AJAX is so that the user never has to load a whole new page. When the user clicks, the new content section is loaded and inserted into the page. You serve the navigation once and then you only serve escaped fragments of the content.
I am using Opera and sometimes a page keeps on loading even though all content has already been presented. How do I find out which elements are to be loaded or what causes the ongoing loading process?
Even though all content seems to be 'presented', the page may still be loading images, JavaScript, CSS, or other resources referenced by it. This process, performed by the browser, isn't referred to as "AJAX", as you have tagged your question. AJAX is the use of JavaScript to asynchronously retrieve or submit data without requiring page refreshes.
As for examining which resources are causing your page to appear to be still "loading"...
I use Firebug's network tab to look at pending requests for resources in Firefox. It shows every resource your browser requests, how long it takes to retrieve, and the entire request & response headers and body.
Google Chrome has something similar built in; just hit F12 to bring up the "Developer Tools".
I would assume Opera has something similar, although I am not sure of its name.
I'm 10000000% sure that this question has been asked before; however, the majority of the responses that I came across were from back in 2005, 2006 and so on. Not to mention, almost all of the questions themselves were too general. Therefore, I'm asking this so that anyone else who needs to find this out won't have to dig through about 50 webpages to get an idea.
My question is simply this: I have a website that has Google Ads embedded in its HTML. The website was first developed as a static HTML site where each link loaded a new page. Never mind the backend technology; the website itself produces purely dynamic content. The website is close to completion, and now an AJAX listener has been added to all the links. When any of the links is clicked, JavaScript takes over, parses the link and sets the address using pushState or the hashbang. The page itself is then requested from the server via AJAX and the content is updated using document.getElementById('container').innerHTML = ajax.responseText; This way, there is an almost 100% reliable method of accessing content that was replaced by AJAX.
This all works fine, but the responseText itself may, WILL contain Google Ads, and I was just wondering how to display them as if it were a static page. Clearly this doesn't work. Here are the options that I've come across:
Use an IFrame:
An IFrame seems to be an effective way to load the content; just stick the AdSense code in a simple adsense.html iframe file and let the browser and
Inserting the ads directly into the page isn't possible:
it's against their TOS
document.write() is not executed in content loaded by an AJAX request
Your best option is:
Create a simple iframe
<iframe src="advert.html"></iframe>
and add your advert code in advert.html.
It then loads fine, without problems.
Good luck