Should sitemap must have main page url as well? - sitemap

I know sitemap should have all urls associated with the site, which eventually helps google to crawl.
My question is, Is it wise to have main url as well in the sitemap OR it doesnot have any meaning?

Sitemaps do not have a 'main page' concept, so it does not have any meaning.
But, by all means, including the URL to what you consider to be your main page in your sitemap always helps.

Related

Is there any way to get backlink from disqus commenting system.?

I have some question suppose if I comment on other blog which don't accept html url .Is there any chance that I will get backlinks for my site if I edit my disqus profile and enter my url.
In short how can I get backlinks for disqus . Withought sapmming.
There is no way to get direct backlinks through using the disqus comment system other than pasting them in the comment content.
As you mentioned you can add your url to your Disqus profile page and each comment you make will link to that page (but with a no-follow link) however the url on the profile is also no-followed so it's hardly a worthwhile way of building backlinks... but it's an extremely good way of getting involved in communities and networking so use it for doing that instead.
In short: You cannot indirectly build backlinks to your chosen url with Disqus.
If you mean to get a backlink using an anchor text, its simply not possible.
I have read few posts and it said that you get backlink from wordpress site with disqus and the backlink are of nofollow type.. But with other sites having disqus google does not index the comments made on disqus. So you don't get a link.

Can we identify googlebot like search engines hit on particular URL

My Problem:
My client site which displays more products and it adds more page load/weight. So i decided to use ajax more products loading and it works well. But here it affects the seo - and no products or deals has been indexed(Even i suggest the client to submit product via googlebase but client doesnot like that idea and he wants direct google crawling into site also he wants less time page load).
Question:
Can we identify the googlebot crawling request to the server or mozila like browser user agent request to the site(server).
Suggestion I have
I tried to identify user agent from requests but that doesnot working(or i might missing something?) Please anyone have correct solution for this problem to reduce the page load time using ajax and get googlebot also to crawl the website.
You should just search stackoverflow for "Google AJAX SEO". There are a number of questions around this.
In short, Google has a specification to make AJAX sites crawlable: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?hl=sv-SE
You can also look into PushState as an SEO option as well.
One tactic that is used to solve this is to harness the pagination function of whatever framework or CMS you are using. You load one page of content and display pagination links in your view then use JavaScript to hide the pagination links and fetch the content of the linked pagination page via Ajax and append it to the current page. Take a look at how infinite-scroll works for inspiration:
http://www.infinite-scroll.com/
Basically you need to be at least loading links to pages that contain the other content so that search engines can crawl the content, but you can hide these links for the users who have JavaScript Enabled.
But to better answer your question, it is possible to redirect robots using htaccess:
redirect all bots using htaccess apache
But it is better SEO, as far as I understand it, to have the content or links to it, actually available on the page.

Is my AJAX content already crawlable?

I have build a site based on Ajax navigation.
I have build it that way, that whenever someone without javascript visits my site, the nav links, which usually load content via Ajax, are acting like normal links and the user can browse through the pages as usual.
Since, Google bot doesn't run javascript, it should theoretically be able to go through all links and corresponding sites as usual, right? Since they are valid links with the href tag pointed to the corresponding site.
Now I was wondering if thats sufficient or if I need to implant this method from Google too to make sure Google sees all my content?
Thanks for your insights and excuse my poor English!
If you can navigate your site by showing source (ctrl-u in chrome), google can also crawl your site. Yes, its that simple

Redirect AJAX page requests to canonical links with .htaccess

I'm coding a site that makes heavy use of AJAX to load pages for users with JavaScript, but I also want it to be friendly for users with JavaScript disabled or unavailable. I've covered all the basics; for example, all my links point to canonical links, and JavaScript loads them via AJAX. My "about" page, therefore, is located at /about/, but will load on the main page and will, once finished, utilize hash/hashbang links to enable back-button functionality.
Here's the problem I have: while a hash/hashbang link will be able to be used to link to a specific page via AJAX for users with JavaScript, if a user with JavaScript attempts to link someone without it to the page, the page cannot be loaded for that person using AJAX.
As such, I'd like to be able, if possible, to use .htaccess to redirect hash/hashbang-specified pages to the canonical link. In other words, the exact opposite of what this contributer was trying to achieve.
http://example.com/#!about --> http://example.com/about/
Is it possible with .htaccess, or otherwise without JavaScript? If so, how?
Thanks!
I don't think it's possible to do this on server side. Because the part of the url after # is not included in the request sent to the server.
I might be a bit late to the party on this one, but i'm looking into this too. Since your url already contains the #!, as opposed to #, you can actually do this. Google will fetch
http://example.com/#!about
as
http://example.com?_escaped_fragment_about
Therefore, if you use a redirect 301 on that, and use javascript to redirect the user only version of the page, you have practically reached your desired result.
I realise you asked for a no-javascript solution, but i figure that was for reasons of SEO. For more information, please see this page by google.
EDIT:
<meta http-equiv="refresh" content="5; url=http://example.com/">
Some more on meta refresh here.
It:
1) Does not require javascript!
-
2) Can be Seo friendly!
-
3) Works with bookmarks and history (etc.)
I hope this helps!

Facebook like or share with dynamic document title

I found this problem all over the net but no answer yet, so maybe here someone solved it ...?
I built a page relying heavily on jquery.address. It's got one index page and the rest loads dynamically via Ajax following Google's /#!/ scheme for crawlable pages. Now I want to add Facebooks Like or share button but I can't get it to grab the actual page title or url.
Whatever I do, it always falls back to title and url of the index page. It tried:
(obviously) changing title an openGraph meta on load of the new parts.
"linking" the crawler page (?_escaped_fragmet_=xyx) but specifying the #! page in meta
"sharing" with a given title and url.
I never get anything but a link to the index page or a blank "share" to the right url with title and thumbnail ignored.
Has anyone got a similar setup working?
Thanks for any hints,
thomas
Facebook is actually using #! now and it works! If you build your site so that http://site.de/?_escaped_fragment=something is identical to http://site.de/#!/something all you have to do is "share" the #! url and it'll display the info from the escaped fragment page.
Use this URL to check: http://developers.facebook.com/tools/debug
But: A much cleaner solution to the problem can be found here: http://github.com/browserstate/history.js/wiki/Intelligent-State-Handling
My guess would be that Facebook's crawler doesn't run Javascript and will always display whatever's actually in the page it gets from the server.
Facebook share has a BRUTAL cache, last time I checked it was impossible to change the title / description data once it was scraped :(
The issue I had was the og:url and the actual url of the page did not match. I also read a number of comments about the og data being just after the title element, but I don't think that solved anything.
With regard to issues of caching, it is true that Facebook's caching is "brutal", but it does not cache anything for the lint tool: http://developers.facebook.com/tools/debug.
I use no-hash-bang urls when sharing links. I process the hard links and redirect them to a hash bang client side using javascript. That way if a crawler goes to the hard linked page it will display the information just as it would if javascript were enabled.
Compare:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Flikeapage.com%2F%23!%2FChristmas%2Fvs%2FBacon
and
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Flikeapage.com%2FChristmas%2Fvs%2FBacon
Hope this helps.

Resources