Googlebot crawling content that was loaded by AJAX?

On my website, when users click on a subject (URL: http://mywebsite/subject/id(1234567)), the title and description are shown, and the content of the subject is loaded by AJAX.
I added this URL (http://mywebsite/subject/id(1234567)) in Fetch as Google (in Google Webmaster Tools) and clicked Fetch and Render. The Rendering tab showed the AJAX content (the result for "This is how Googlebot saw the page:" included the content loaded by AJAX), but the Fetch tab did not: it contained only the HTML source of my page, without the AJAX content.
Does this mean that Googlebot will crawl and index the AJAX content of my website?

The answer was no: you had to use the AJAX crawling scheme described at https://developers.google.com/webmasters/ajax-crawling/
But yesterday Google changed the rules of the game:
http://googlewebmastercentral.blogspot.com/2015/10/deprecating-our-ajax-crawling-scheme.html
Google states "we are generally able to render and understand your web pages like modern browsers."

Related

What are full page reloads, and why did we need them before AJAX?

I was reading up on AJAX and how it empowers us to exchange data with a server behind the scenes and consequently avoid full page reloads. My confusion lies here: I don't really understand what full-page reloads mean. I think it's probably because I've been working with AJAX/React since the start, and I have never really seen any webpage of mine fully reload when I access stuff from a database or an API.
It'd be great if someone could explain what they are and why we needed them before AJAX.
A full page load is where the entire page is downloaded from the server. A page typically consists of several sections: header, footer, navigation, and content. In a classic web application without AJAX, a user clicks on a link to another page, and has to download the full page, even though only the main content is changing. The header, footer, and navigation all get downloaded again even though they don't change.
With AJAX there is the opportunity to only change the parts of the page that will change. When a user clicks on the link, JavaScript loads just the content for that link and inserts it into the current page. The header, footer, and navigation don't need to reload.
This introduces other problems that need attention.
When AJAX inserts new content into the page, the URL doesn't change. That makes it difficult for users to bookmark or link to specific content. Well-written AJAX applications use history.pushState() to update the URL when loading content via AJAX.
There are then two paths to get to every piece of content. Users can either load the URL containing that content directly, or load the content into some other page by following a link. Web developers need to test and ensure both work.
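As an illustration, here is a minimal sketch of that pattern in plain JavaScript; the #content element, the /news URL, and the fragment=1 query parameter are all invented for the example, not part of any real API:

// Load a content fragment from the server and insert it into the page.
// Assumes the server returns just the main content when fragment=1 is set.
function loadContent(url) {
  fetch(url + '?fragment=1')
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.getElementById('content').innerHTML = html;
    });
}

// After loading content via AJAX, update the URL so it can be bookmarked.
loadContent('/news');
history.pushState({ url: '/news' }, '', '/news');

// Handle the browser's back/forward buttons by restoring the stored URL.
window.addEventListener('popstate', function (event) {
  if (event.state && event.state.url) {
    loadContent(event.state.url);
  }
});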
Search engines have trouble crawling AJAX powered sites. For best compatibility, you need to employ server side rendering (SSR) or pre-rendering to serve initial content on a page load that doesn't require JavaScript.
Even for Googlebot (which executes JavaScript) care must be taken to make an AJAX powered site crawlable. Googlebot doesn't simulate user actions like clicking, scrolling, hovering, or moving the mouse.
Content needs to appear on page load without any user interaction.
You must use <a href=...> links for navigation so that Googlebot can find other pages by scanning the document object model (DOM). For users, JavaScript can intercept clicks on those links and prevent a full page load by using return false from the onclick handler or event.preventDefault() in the click handler.
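To sketch how those two pieces fit together (the ajax-link class is illustrative, and loadContent() is the helper from the previous sketch):

<nav>
  <a href="/news" class="ajax-link">News</a>
</nav>
<script>
// Googlebot finds the plain <a href> in the DOM and can crawl it.
// For users, intercept the click and load the content via AJAX instead.
document.addEventListener('click', function (event) {
  var link = event.target.closest('a.ajax-link');
  if (!link) return;
  event.preventDefault();                          // no full page load
  loadContent(link.href);                          // AJAX load instead
  history.pushState({ url: link.href }, '', link.href);
});
</script>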

"Fetch as Google" renders all pages to look like my homepage

I am trying to figure out why my website's posts and pages, such as my resume, get a "Complete" status with a green check mark (seemingly no errors or redirects) when fetching and rendering as Google, yet all of them "render" to look like my homepage. The PageSpeed Insights tool seems to use the same rendering engine, since it has the same issue.
Notes:
The HTML served from my website on the initial page load is the correct HTML and content. No redirects occur. The initial page load does not fetch content via JS. I mention this because, although my website is not a one-page application (I'm using WordPress), I do use AJAX in combination with a POST variable flag to fetch new page content when the user navigates to the next page (after the initial page load).
I have verified that all of my pages have been indexed using the "site:" trick in Google search. They are indexed properly, but they aren't "rendering" properly.
Should I be worried? Should I just ignore that the pages aren't rendering properly? It doesn't make any sense. Is anyone else having this issue?
Your resume page is served with a Content-Type header of image/gif, so Google thinks the page is an image.
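If you want to verify this yourself, a quick check from the browser console (the URL is a placeholder) will show the header the server actually sends:

// Log the Content-Type header returned for a URL.
fetch('https://example.com/resume').then(function (response) {
  console.log(response.headers.get('content-type')); // expect text/html, not image/gif
});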

Google ajax crawling not working with fetch as google

I am trying to test an Orchard website that has AJAX content with "fetch as google". Shouldn't Google replace http://cmbbeta.azurewebsites.net/#! with http://cmbbeta.azurewebsites.net/?_escaped_fragment_ (both links work)? When I hit my beta website with Fetch as Google, the preview shows me that the page is loading the AJAX content, not the static one.
Am i missing something?
The preview that appears when you put your mouse over the link always seems to show the dynamic website. The important thing to look at is the fetch result, which you can access by clicking the "Success" link in the "Fetch Status" column.
This is probably not affecting your site, but the Fetch as Google feature doesn't work for AJAX URLs that are specified with the <meta> tag.
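For reference, the (now deprecated) AJAX crawling scheme mapped URLs like this; example.com and the fragment values are illustrative:

<!-- Hash-bang URLs: the crawler rewrites the fragment into a query parameter. -->
<!-- http://example.com/#!news  is fetched as  http://example.com/?_escaped_fragment_=news -->

<!-- Pages without a hash fragment had to opt in with this meta tag... -->
<meta name="fragment" content="!">
<!-- ...and were then fetched as  http://example.com/page?_escaped_fragment_= -->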

Google crawling, AJAX and HTML5

HTML5 allows us to update the current URL without refreshing the browser. I've created a small framework on top of HTML5 which allows me to leverage this transparently, so I can do all requests using AJAX while still having bookmarkable URLs without hashtags. So e.g. my navigation looks like this:
<ul>
  <li><a href="/">Home</a></li>
  <li><a href="/news">News</a></li>
  <li>...</li>
</ul>
When a user clicks on the News link, my framework in fact issues an AJAX GET request (jQuery) for the page and replaces the current content with the retrieved content. After that, the current URL is updated using HTML5's pushState(). However, it is still equally possible to just type http://www.example.com/news in the browser, in which case the content will be provided synchronously of course.
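A rough sketch of what such a framework might do with jQuery (the selectors are illustrative, and it assumes the server returns just the content fragment for AJAX requests, which jQuery marks with an X-Requested-With header):

// Intercept navigation clicks, fetch the new content, and update the URL.
$(document).on('click', 'ul a', function (event) {
  event.preventDefault();               // prevent the full page load
  var url = this.href;
  $.get(url, function (fragment) {
    $('#content').html(fragment);       // swap in the new content
    history.pushState(null, '', url);   // bookmarkable URL, no hashtag
  });
});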
The question now is, will Google crawl the pages for this site? I know that Google provides a guide for crawling Ajax applications, but the article supposes that hashtags are used for bookmarkability, and I don't (want to) use hashtags.
Since you have actual hard links to the pages and they load the same content, Google will crawl your site just fine.

How does a website load only part of the page and still display full on URLs?

I am looking at the Gawker blogs (http://io9.com, http://lifehacker.com/) and I'm curious about how they are made.
When I click on a link, only the article part of the page reloads, displaying a loading icon while it does.
But what I can't figure out is that the links point to new URLs like io9.com/something/something, and it's not like what I see on AJAX pages, where JavaScript appends a site.com/#something tag to the end of the URL to mark the page after an AJAX request.
Can I change the full URL from JavaScript, or what is happening?
When that happens, the website is using the HTML5 History API. This API can change the URL (via JavaScript) without reloading the page.
See caniuse.com for browser support.
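Since older browsers lack the API, a common pattern is to feature-detect it and fall back to a normal page load; /news here is just a placeholder:

// Use pushState where the History API exists; otherwise navigate normally.
if (window.history && typeof history.pushState === 'function') {
  history.pushState(null, '', '/news');
} else {
  window.location.href = '/news';
}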
If you would like to implement it in your website, backbonejs.org would be very useful.
