First I think I understand what cloaking is, but what is it in detail?
My problem: I've a webapp created using wavemaker, so it's full of javascript and ajax calls. Therefore the google crawler can't see any of my content. My idea is now to make a different simple html page for users have javascript disabled and for the google crawler. This page contains a javascript block and a redirect like this:
<script language="javascript" type="text/javascript">
//redirect to the ajax page
window.location.href = 'http://www.myhomepage.com/index.html?page=about';
</script>
The redirect will only occur when a user browses this site and have javascript turned on. The google crawler will never be redirected. Both pages have the same content, but different URLs. Do you think this technic is cloaking?
I think all the points that were raised on Flash based web sites apply to this scenario.
You have 1 web site that uses technologies the Search engine crawlers can not read (fully \ correctly).
here is what Matt Cutts said:
"A good rule of thumb is to take a look at your site in a text browser like Links or an ancient browser with JavaScript/CSS/Flash turned off. If you can reach all your pages just by clicking regular links, your site should be pretty crawlable."
http://www.mattcutts.com/blog/solved-another-common-site-review-problem/
based on that, and other articles:
if your code will show Search engine crawlers the same content - i do not think this is cloaking
Even simpler:
You could make use of the <noscript> Tag, hence delivering content to users (and the google bot) that have javascript turned off. No need for an ugly redirect ...
Just use it like this:
<noscript>Your content for Javascript disabled browsers and bots here</noscript>
It's usually referred to as a black hat SEO trick.
However it can have other uses, generally however a good server application shouldn't need to use tricks like that.
I think this explains it better than me though ...
http://en.wikipedia.org/wiki/Cloaking
Clocking is a SEO unethical technique which not follow SEO rules and harmful for your site. It's also call black hat SEO Technic. In the technique the content presented to the search engine spider is different from that presented to the user's browser.
Related
I have developed a website using angularjs and web api.
The problem is that the ajax rendered content is not crawable by google. And no one can find the website using google search.
After reading many articles regarding this issue, including:
This one with all links of explanation going out,
Google ajax crawling protocol, and also stack over flow question, I couldn't find the proper solution. Those that mention asp.net solutions, are talking about mvc, and I need only the simple REST by web api, other articles are not talking about asp.net.
Is there any simple explanation?
I'm the one who asked this same question long ago, so I will answer from my experience:
Firstly, if all your content are accessible via unique URIs (including the hashbang if you use it), modern search engines should index it just fine. In fact Google can index javascript generated content now. You can try that via the Google Webmaster tool and see how your site is indexed.
Secondly, there are libraries that help you to serve parsed content to search engines if you need to, but in my case I didn't bother much with it since Google is indexing js nicely.
I've seen others ask this question, and maybe I'm missing something or this is outdated, but I don't see why AngularJS needs to be an issue with SEO.
Say you have a landing page and it has a bunch of links. Assuming you're using html5 mode in AngularJS (and I'm not sure that's 100% necessary) and something like ng-route then the links on the landing page can work both as "angular" (JavaScript) links and "old school" (full page load) links.
If you're a human user you can click a link and it will do angular magic and adjust the content without loading the full page. Ok, all fine.
But if you instead copy the link and paste it in a new tab or new browser, it will still work - assuming you've set up routes correctly.
I'm not an SEO expert by any stretch of the imagination, but as I understand it, having links that load pages and having those pages have real and useful content is the core of SEO, and done this way, AngularJS should work fine. The key thing to check is if you copy and paste the link (not just click it) that it works.
I've got a very unique situation that I don't believe any of the other topics here can relate.
I have a ecommerce module that is dynamically loaded / embedded into third party sites, no iframe straight JSON to web client into content. I have no access to these third part sites at all, other then my javascript file being loaded from their page and dynamically generating the content.
I'm aware of the #! method, but that's no good here, my JS does generate "urls" within the embedded platform, but they're fake and for the address bar only, and I don't believe google crawlers can reach this far.
So my question is, is there a meta that we can set to point outside the url to i.e. back to my server with static crawlable content. I.e. pointing the canonical to my server... but again I don't think that would work.
If you implement #! then you have to make sure the url your embedded in supports the fragment parameter versions, which you probably can't. It's server side stuff.
You probably can't influence the canonical tag of the page either. It again has to be done server side. Any meta tag you set via JavaScript will not be seen by a bot.
Disqus solved the problem by providing an API so the embedding websites could get there comments server side and render then in plain html. WordPress has a plugin to do this. Disqus are also one of the few systems that Google has worked out how to crawl their AJAX pages.
Some plugins request people to also include a plain link with the JavaScript. Be careful with this as you may break Google Guidelines if you do it wrong. But you may be able to integrate the plain link with your plugin so that it directs bots and users to a crawlable version of the content.
Look into Google's crawlable ajax standard (and why it's a bad idea) and canonical URLs.
Now you can actually do this. A complete guide and examples can be found here: https://github.com/kubrickology/Logical-escaped_fragment
I'm coding a site that makes heavy use of AJAX to load pages for users with JavaScript, but I also want it to be friendly for users with JavaScript disabled or unavailable. I've covered all the basics; for example, all my links point to canonical links, and JavaScript loads them via AJAX. My "about" page, therefore, is located at /about/, but will load on the main page and will, once finished, utilize hash/hashbang links to enable back-button functionality.
Here's the problem I have: while a hash/hashbang link will be able to be used to link to a specific page via AJAX for users with JavaScript, if a user with JavaScript attempts to link someone without it to the page, the page cannot be loaded for that person using AJAX.
As such, I'd like to be able, if possible, to use .htaccess to redirect hash/hashbang-specified pages to the canonical link. In other words, the exact opposite of what this contributer was trying to achieve.
http://example.com/#!about --> http://example.com/about/
Is it possible with .htaccess, or otherwise without JavaScript? If so, how?
Thanks!
I don't think it's possible to do this on server side. Because the part of the url after # is not included in the request sent to the server.
I might be a bit late to the party on this one, but i'm looking into this too. Since your url already contains the #!, as opposed to #, you can actually do this. Google will fetch
http://example.com/#!about
as
http://example.com?_escaped_fragment_about
Therefore, if you use a redirect 301 on that, and use javascript to redirect the user only version of the page, you have practically reached your desired result.
I realise you asked for a no-javascript solution, but i figure that was for reasons of SEO. For more information, please see this page by google.
EDIT:
<meta http-equiv="refresh" content="5; url=http://example.com/">
Some more on meta refresh here.
It:
1) Does not require javascript!
-
2) Can be Seo friendly!
-
3) Works with bookmarks and history (etc.)
I hope this helps!
i am planing to start a full ajax site project, and i was wondering about SEO.
The site will have urls like www.mysite.gr/#/category1 etc
Can Google crawl the site.
Is something that i have to noticed about full ajax and SEO
Any reading suggestions are welcome
Thanks
https://stackoverflow.com/questions/768233/do-hashes-in-urls-affect-seo
You might want to read about so called progressive enhancement.
Google supports indexing of AJAX sites, but unfortunately it involves extra work for the developer. See http://code.google.com/web/ajaxcrawling/docs/getting-started.html
I don't think Google is capable of doing so (yet)
http://googlewebmastercentral.blogspot.com/2009/10/proposal-for-making-ajax-crawlable.html
However you can of course make your site usable with or without JavaScript. That way, browsers will have the full candy stuff and Google (and text browsers) still can navigation your site.
In addition to SEO, you also need to think about usability standards here. A site that is that reliant on AJAX isn't going to work for things like screen-readers as well as spiders. You need a system for graceful degreadation. A website that can't function without JavaScript isn't really a functioning website.
The search engines will spider the initial page load - what happens to the page (with ajax) after that is irrelevant to listings.
Google itself doesn't crawl ajax content but advice a mechanism for it. For this you first need to change # to #!
Whole process to SEO AJAX content is explained here along with simple asp.net code to start working on it.
Imagine having to hit the “refresh” button in your browser to update your Twitter feed rather than just hitting the button on the page itself and having it instantly update? These are the types of problems that AJAX solves, although it does come with its pitfalls. Google might claim it’s able to crawl and parse AJAX websites, yet it’s risky to just take its word for it and leave your website’s organic traffic up to chance. Even though Google can usually index dynamic AJAX content, it’s not always that simple. This guide covers some of the things that can go wrong and how you can make sure your AJAX website is crawlable: https://prerender.io/ajax-seo/
We're coming from GWT projects and because of problems with SEO not liking GWT for our next project we're going to move clear of GWT (mainly because seo is a high priority for this next project). In choosing a new framework, I'm looking at Wicket and liking what I've seen so far. I've only done a few tutorials, but in looking at the war layout (from these tutorials) it looks like most of the html pages are in the WEB-INF folder.
It this going to cause problems for SEO and search engines crawling through the sites files?
Ideally, I'd like to use Wicket with some AJAX and deploy to Google App Engine.
It does not matter if your .jsps (or whatever) are stored in /WEB-INF. It just means they cannot be accessed directly by going to http://webapp/path/to/jsp.
For SEO think about:
Meaningful URLs and link text (i.e. URLs should be similar to expected search engine queries)
Crawlable pages (make sure all your content can be reached by a non-JS enabled bot... i.e. don't make content only available through AJAX, for instance). A sitemap might help
Look into Wicket's Bookmarkable page links and UrlCodingStrategies for a very powerful combination to use in SEO. Basicly all your links and parameters can be encoded as/a/static/url, regardless of (changing) implementation on the backend.
if you project SEO is really important than you might reconsider using a lot of ajax since crawler wont execute javascript they are not gonna read all the return of your ajax calls... that being said the SEO quality of your site is not really based on the framework you will be using ... jsut always think about img alts, links, meta, title, h1 ... in every pages and you should be fine ... also always try to post links to your site on other websites to gain visibility and get importance for crawlers