Google Webmaster Tools: sitemaps submitted every day (?)

After submitting my sitemap.xml through Webmaster Tools the first time, I now see new plots of submitted URLs every day (next to the indexed ones, under the Optimization -> Sitemaps menu) without doing anything myself. I use Drupal 7 with the XML sitemap module (http://drupal.org/project/xmlsitemap) and no automated tasks are enabled.
Does this mean the URLs are submitted "internally" by Google every day, or is there something wrong that I need to fix?
Many thanks for your help.

Google will remember any sitemaps you submit and their crawler will automatically download those and associated resources more or less whenever it feels like doing so. This is usually reflected in your Webmaster Tools. In all likelihood it'll even do so without you entering your sitemap on their website if your site gets linked to. Same goes for pretty much any other bot and crawler out in the wild.
No need to worry, everything is doing what it's supposed to. It's a Good Thing(tm) when Google crawls your site frequently :).

Related

Robots.txt in Magento Websites

I have recently started working with a company that has a Magento eCommerce website.
We spotted that traffic dipped considerably in May, along with the rankings on Google.
When I started investigating, I saw that the pages of the ES website were not appearing in Screaming Frog: only the homepage showed, and its status said "blocked by robots.txt".
I mentioned this to my developer and they said they would move the robots.txt file to the /pub folder, but wouldn't that mean the file exists in two places? Would that be an issue?
The developer has gone ahead and done this. How long should it take to see whether Screaming Frog is crawling the pages?
Could any Magento developers help with advice on this?
Thanks
Neo
There is a documentation page for how to manage robots.txt with Magento 2.x.
And you can use this to allow all traffic to your site:
User-agent: *
Disallow:
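For comparison, an illustrative example (not from the original answer): the "blocked by robots.txt" status that Screaming Frog reported typically means the file served at /robots.txt contains a site-wide disallow rule, something like:
User-agent: *
Disallow: /
A single "Disallow: /" line blocks every URL on the site, which matches the behaviour described in the question. Also note that crawlers only ever request the file at the root of the host (e.g. http://www.example.com/robots.txt), so even if a copy ends up in two places on disk, the only one that matters is the one actually served at that URL.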
Regarding the Googlebot crawl rate, here is some explanation on it.
According to Google, "crawling and indexing are processes which can take some time and which rely on many factors." It's estimated that it takes anywhere from a few days to four weeks before Googlebot indexes a new site. If you have an older website that isn't being crawled, the design of the site may be the problem. Sometimes sites are temporarily unavailable when Google attempts to crawl them, so checking the crawl stats and looking at the errors can help you make changes that get your site crawled. Google also uses two different crawlers: a desktop crawler that simulates a user on a desktop, and a mobile crawler that simulates a search from a mobile device.

Google not indexed url

Hello, I have a personal site, and about a month ago I rebuilt the complete site. I submitted a new sitemap.xml file and it is not indexed yet, but I'm getting 404 crawl errors for the old URLs.
Google says the sitemap is correct, so any ideas? Should I do something, or just wait longer?
It's not really important because it's just a personal site, but I'm curious about why this is happening.
Sorry for my bad English, I'm Spanish. Thanks in advance.
This just takes time.
I experienced some speed-up in that process by using other Google services such as Places or Analytics. But to answer your question:
If your sitemap has been detected correctly it will work, but it might take some time.

AdSense on history.pushState enabled page

First off, I know this has been discussed over and over again. But let's take this as a "late 2012 edition" since things tend to change rapidly on the internet.
I have this web page which is a "classical" web page with full page refreshes. Every internal click produces new content. We can show AdSense ads this way without a problem.
Now I started looking into "ajaxifying" (PJAX) the whole page for performance reasons (I've actually made a prototype version and it works superbly). The whole thing works only in browsers that support history.pushState, and whenever a user clicks an internal link an AJAX request is triggered that fetches only the content part of the page (everything between the header and the footer) and replaces the old content with it.
The end result is that the user is presented with a brand-new page (including the changed URL and what not); only the mechanism for delivering the page has changed (full reload vs. AJAX). As far as Google (and older browsers) are concerned, this is still a regular page with regular links (progressive enhancement and all that).
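To make the mechanism concrete, here is a minimal sketch of that PJAX behaviour; the #content element, the X-PJAX request header and the server-side partial response are assumptions for illustration, not details from the original prototype:
// Fetch only the content fragment of a page and swap it into <div id="content">.
function loadContent(url, push) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.setRequestHeader('X-PJAX', 'true'); // hypothetical hint so the server returns only the middle of the page
  xhr.onload = function () {
    document.getElementById('content').innerHTML = xhr.responseText;
    if (push) history.pushState({ url: url }, '', url); // update the address bar
  };
  xhr.send();
}

// Progressive enhancement: only intercept clicks when pushState is available.
if (window.history && history.pushState) {
  document.addEventListener('click', function (e) {
    var link = e.target.closest ? e.target.closest('a') : null;
    if (!link || link.host !== location.host) return; // leave external links alone
    e.preventDefault();
    loadContent(link.href, true);
  });
  window.addEventListener('popstate', function (e) {
    if (e.state && e.state.url) loadContent(e.state.url, false); // handle back/forward
  });
}
Older browsers never enter the if-block, so they keep doing full page loads against the same URLs, which is what keeps the site crawlable.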
And yet there isn't a way to display AdSense, what with the document.write's and AdSense's TOS ruining the party.
My question: is there a Google-approved way (I'm not interested in hacks that will get us banned) to display AdSense ads on a page like this? I haven't found one. And if there isn't, does Google have any plans to support this in the future? Again, I haven't found anything related to this.
Update
After some more digging around I came across Google DFP, which seems to support asynchronous loading of ads. But I'm not sure I can load AdSense ads through it dynamically without breaking the TOS. I'm 100% sure I can load other ads this way, just not sure about AdSense. Could somebody clear this up for me?
According to this page, when loading AdSense ads through DFP you are subject to both the DFP and the AdSense terms. So I guess that if you are following the current AdSense terms, you are not allowed to do what you are talking about... At the same time, Google provides a rather easy method to do exactly what you want to do with DFP...
It's still a grey area...
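For reference, a minimal sketch of how asynchronous ad slots are defined and refreshed with DFP's Google Publisher Tags after a pushState navigation; the ad unit path /1234567/content_slot and the div id are placeholders, and whether filling such a slot with AdSense complies with the AdSense terms is exactly the grey area discussed above:
<script async type="text/javascript" src="//www.googletagservices.com/tag/js/gpt.js"></script>
<script type="text/javascript">
  // Define one asynchronously rendered slot inside <div id="div-gpt-ad-content"></div>.
  var googletag = googletag || {};
  googletag.cmd = googletag.cmd || [];
  var contentSlot;
  googletag.cmd.push(function () {
    contentSlot = googletag
      .defineSlot('/1234567/content_slot', [300, 250], 'div-gpt-ad-content') // placeholder ad unit path
      .addService(googletag.pubads());
    googletag.enableServices();
    googletag.display('div-gpt-ad-content');
  });

  // After a successful PJAX navigation, ask GPT to fetch a new creative for the slot.
  function refreshAdsAfterPjax() {
    googletag.cmd.push(function () {
      googletag.pubads().refresh([contentSlot]);
    });
  }
</script>
The refresh() call is the piece that a full-page-reload site never needs but a pushState site does, which is why DFP looks attractive here.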

Why use a Google Sitemap?

I've played around with Google Sitemaps on a couple sites. The lastmod, changefreq, and priority parameters are pretty cool in theory. But in practice I haven't seen these parameters affect much.
And most of my sites don't have a Google Sitemap, and that has worked out fine. Google still crawls the site and finds all of my pages. The old meta robots and robots.txt mechanisms still work when you don't want a page (or directory) to be indexed. And I just leave every other page alone; as long as there's a link to it, Google will find it.
So what reasons have you found to write a Google Sitemap? Is it worth it?
From the FAQ:
Sitemaps are particularly helpful if:
- Your site has dynamic content.
- Your site has pages that aren't easily discovered by Googlebot during the crawl process—for example, pages featuring rich AJAX or images.
- Your site is new and has few links to it. (Googlebot crawls the web by following links from one page to another, so if your site isn't well linked, it may be hard for us to discover it.)
- Your site has a large archive of content pages that are not well linked to each other, or are not linked at all.
It also allows you to provide more granular information to Google about the relative importance of pages in your site and how often the spider should come back. And, as mentioned above, if Google deems your site important enough to show sitelinks under it in the search results, you can control what appears via the sitemap.
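As an illustration, a sitemap.xml entry carrying that extra information looks like this (the URL and the values are made up):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/articles/some-article</loc>
    <lastmod>2012-11-20</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The lastmod, changefreq and priority elements are the hints the question is asking about; only loc is required.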
I believe the "special links" in search results are generated from the google sitemap.
What do I mean by "special link"? Search for "apache", below the first result (Apache software foundation) there are two columns of links ("Apache Server", "Tomcat", "FAQ").
I guess it helps Google prioritize its crawl. In practice, I was involved in a project where we used the gzipped, large version of a sitemap, and it helped massively. And AFAIK there is nice integration with Webmaster Tools as well.
I am also curious about the topic, but does it cost anything to generate a sitemap?
In theory, anything that costs nothing and may have a potential gain, even if very small or very remote, can be defined as "worth it".
In addition, Google says: "Tell us about your pages with Sitemaps: which ones are the most important to you and how often they change. You can also let us know how you would like the URLs we index to appear." (Webmaster Tools)
I don't think that the statement quoted above is achievable with the traditional mechanisms that search engines use to discover URLs.

How do I Extend Blogengine.Net to collect statistics of visitors?

I love BlogEngine. But from what I can see, it does not collect the standard information about visitors that I would like to see (referrer, browser type and so on).
When I log in as Admin I have a menu item named "Referrer". I can choose a weekday and then I'll be presented with one or two rows such as "google.com 4 hits", "itmaskinen.se 6 hits" and so on. But that's not what I want to see; I want to see where my visitors come from (country, IP if possible), how many visitors there are, and so on.
If any of you are familiar with BlogEngine.NET and can point me in the right direction as to where I would put my own logging code, or if you know of any visitor-statistics extension that can do it for me, I would be really happy to know. I prefer an extension, because if I make changes to BlogEngine myself, they may break later updates I install.
Blogengine.Net is a blog software made in .Net found here: http://www.dotnetblogengine.net/
And yes, I prefer to ask this question here rather than in the BlogEngine.NET forum, you know why. ;)
This isn't an extension, but it's what I use to collect all my blogengine.net data and it should be upgrade safe.
When you log into the BlogEngine.NET admin screens, go to "Settings > Custom Code > Tracking Script"; there you can put your http://www.google.com/analytics/ tracking script. Google Analytics provides all the referrer, browser type, etc. information you were wanting. And what's nice is that you can then create additional accounts for other sites if you choose.
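For reference, this is the kind of script that box expects; a minimal sketch of the standard asynchronous Google Analytics snippet from that era, with UA-XXXXX-X as a placeholder for your own property ID:
<script type="text/javascript">
  // Queue the account and the initial pageview before ga.js has loaded.
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-X']); // placeholder property ID
  _gaq.push(['_trackPageview']);
  (function() {
    // Load ga.js asynchronously so it doesn't block page rendering.
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
</script>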
I use both Google Analytics and StatCounter to track visitor stats. I find that each one provides useful information that the other doesn't. And they're both free to a certain extent.
I place their JavaScript code in the site.master file of my custom BE.NET skin.
For Google Analytics I go a step further and pass the username of authenticated users as a custom variable. That way I can match users names up with the stats. To do this you can use the _setVar javascript method on the GA pageTracker like so:
<!-- ga.js must be loaded before _gat is available -->
<script type="text/javascript" src="http://www.google-analytics.com/ga.js"></script>
<script type="text/javascript">
// Create the tracker for this property
var pageTracker = _gat._getTracker("UA-129049-25");
// Emit the logged-in username (empty for anonymous visitors) as the custom variable
var userDefinedValue = '<%= System.Web.Security.Membership.GetUser() != null ? System.Web.Security.Membership.GetUser().UserName : "" %>';
pageTracker._setVar(userDefinedValue);
// Record the page view
pageTracker._trackPageview();
</script>
Has anyone noticed that we miss all the hits coming from RSS readers? Syndication.axd does not run the analytics JavaScript, so we lose the vast majority of readers from the statistics and end up happily analyzing only what matters least: the ad-hoc visitors.
For the vast majority of cases, Google Analytics does just fine. It all depends on how much data you want. For example, if you want to keep note of IP addresses and resolve them to get domain names, and also highlight all visits to your blog from, say, your coworkers at the company where you work, you'd have to write some custom code yourself. However, it's all fairly primitive - these sorts of things are easily achievable using ASP.NET.
I set up statistics gathering on the IIS website of my BlogEngine instance and then analyze the logs using WebLog Expert - http://www.weblogexpert.com.
It is more reliable than Google Analytics, since I see ALL requests that reach my IIS, whether they are requests to an .axd handler or to static content. At one point I also found that Google was misreporting the number of visits; since then I trust my IIS statistics much more than Google.
There is a widget which can be used to display visit and online-user statistics.
You can find it at the following links:
http://www.nuget.org/packages/Statistics/
http://www.itnerd.ir/post/2013/07/25/Visits-and-Online-Users-Statistics-widget-for-BlogEngine-2
The instructions are at the second link.
