Google does not index my images - using sitemap, multi-lang subdomains and static subdomain

Most of my images cannot be found in the Google Image Search.
I have submitted Google Sitemaps. There are no problems reported on Search Console, but only 1 image out of 34 is indexed. I suspect my multi-language setup could be a problem.
I have a website which serves output in different languages. Each language has its own subdomain: de.openisles.org and en.openisles.org.
Each language subdomain has its own sitemap, which carries the language-dependent text (the image captions, for example).
My sitemap entries look like this:
<!-- de.openisles.org/sitemap.xml -->
<url>
<loc>http://de.openisles.org/media/screenshots/2016-01-03-demanded-goods.html</loc>
<image:image>
<image:loc>http://static.openisles.org/media/screenshots/2016-01-03-demanded-goods.png</image:loc>
<image:caption>Infopanel: verlangte Güter</image:caption>
</image:image>
</url>
<!-- en.openisles.org/sitemap.xml -->
<url>
<loc>http://en.openisles.org/media/screenshots/2016-01-03-demanded-goods.html</loc>
<image:image>
<image:loc>http://static.openisles.org/media/screenshots/2016-01-03-demanded-goods.png</image:loc>
<image:caption>Info panel: Demanded goods</image:caption>
</image:image>
</url>
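For context, `<url>` entries like these are only valid inside a `<urlset>` root that declares both the sitemap namespace and the `image:` namespace. The following is a minimal Python sketch of generating such a file with the standard library. The function name `build_image_sitemap` and the tuple layout are assumptions made for illustration, not part of the setup described here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(entries):
    """Build a sitemap string from (page_url, image_url, caption) tuples.

    Hypothetical helper: one <image:image> per page, as in the snippets above.
    """
    # Serialize the sitemap namespace as the default one and the image
    # extension with the conventional "image" prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_url, caption in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
        ET.SubElement(image, f"{{{IMAGE_NS}}}caption").text = caption
    return ET.tostring(urlset, encoding="unicode")
```

Calling it with the German entry above should produce a `<urlset>` whose opening tag carries both `xmlns` declarations, which is worth comparing against the real sitemap files.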
The two websites link to each other, so that Google knows it's the same content in another language.
<link rel="alternate" hreflang="de" href="http://de.openisles.org/media/screenshots/2016-01-03-demanded-goods.html" />
<link rel="alternate" hreflang="en" href="http://en.openisles.org/media/screenshots/2016-01-03-demanded-goods.html" />
Because images are not language-dependent (and I do not want them to be), I serve them from an additional subdomain, static.openisles.org. To tell Google that the static server belongs to me, I also added this subdomain in Search Console.
My question is simple: What am I doing wrong? Why is Google not indexing my images?

It's entirely possible that nothing is wrong with your sitemap, especially if Google Search Console doesn't say anything.
Google has its own algorithm for what to index and what not to index, and submitting a sitemap does not guarantee indexing; it only helps Google to more fully map out your website, if it decides to crawl it.
I've submitted a sitemap for a library which contained 4,000,000 URLs, but it's been close to a month now and Google has only indexed around 14,000.
I think the fact that even one of your images has been indexed is a good sign - Google was able to find it! Have patience, my friend, and I think you'll find the other images will slowly get indexed as well.
Best of luck!

Related

Google Cache and multi lingual domain names

I've developed a site which is available via two top-level domain names. The language on both sites is Dutch: one is for Dutch visitors and one is for Belgian visitors.
The .be version of the site was recently "launched". Under the hood it's the same site, of course, and we're using a meta tag to prevent getting penalized for duplicate content. (Google's support page)
So; there's this page: www.domain.nl|be/vakantie/oostenrijk/tirol/
And depending on the TLD this is the implemented meta tag:
<!-- Dutch site visitors -->
<link rel="alternate" hreflang="nl-NL" href="http://www.bergenmeer.nl/vakantie/oostenrijk/"/>
<!-- Belgian site visitors -->
<link rel="alternate" hreflang="nl-BE" href="http://www.bergenmeer.be/vakantie/oostenrijk/"/>
The Belgian version has been live for about 6 weeks. Both sites are equipped with a sitemap listing the URLs for that domain. But we're seeing the following in Google Cache.
The live version of this page (see URL, phone number on the top right).
The cached version of this page (see URL, phone number on the top right).
When you load this page (despite some performance issues, we're looking into that) and inspect the network traffic, you'll see the page opens with an HTTP 200 response. No redirects whatsoever. Why is Google not showing the Belgian version of the page?
Thanks for the time you take to share your thoughts.
Ben
For .be you could have
<link rel="canonical" href="http://www.bergenmeer.be/vakantie/oostenrijk/"/>
<link rel="alternate" hreflang="nl-NL" href="http://www.bergenmeer.nl/vakantie/oostenrijk/"/>
and for .nl you could have
<link rel="canonical" href="http://www.bergenmeer.nl/vakantie/oostenrijk/"/>
<link rel="alternate" hreflang="nl-BE" href="http://www.bergenmeer.be/vakantie/oostenrijk/"/>
This gives Google a hint about which version you want prioritised, and therefore which one should make it into the cache, since it currently appears to be using only the alternate.
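As a rough illustration of the canonical-plus-alternate idea, here is a small Python sketch that renders the `<link>` tags for one regional variant of a page. The function name `head_links` and its arguments are hypothetical. Unlike the snippets above, it also emits a self-referencing `hreflang` entry, because Google's hreflang guidance asks each version to list itself along with all other versions.

```python
from html import escape


def head_links(variants, current):
    """Render canonical and hreflang <link> tags for one variant of a page.

    variants: dict mapping an hreflang code to that variant's URL.
    current:  hreflang code of the variant being rendered.
    (Hypothetical helper, not part of any CMS or library.)
    """
    # The canonical tag points at the variant that is being served.
    lines = ['<link rel="canonical" href="%s"/>' % escape(variants[current])]
    # Every variant, including the current one, is listed as an alternate.
    for code, url in sorted(variants.items()):
        lines.append(
            '<link rel="alternate" hreflang="%s" href="%s"/>'
            % (escape(code), escape(url))
        )
    return "\n".join(lines)
```

For the .be page, `head_links({"nl-NL": "http://www.bergenmeer.nl/vakantie/oostenrijk/", "nl-BE": "http://www.bergenmeer.be/vakantie/oostenrijk/"}, "nl-BE")` yields a canonical tag for the .be URL followed by both alternates.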

#! url showing up at the top of my search results

We are just getting started with SEO/Ajax, so we're hoping someone can help us figure this out. One of the #! URLs is showing up as the first organic result for our startup nurturelist.com. Although this link technically works, 1) we would prefer not to have any #! URLs show up in search results, because they look weird and we have non-#! versions, and 2) the second organic result in the image is the one that we'd actually like to appear at the top.
Thanks very much for any thoughts on how we can make this happen...
Do you just simply not want the #! to show up in search results? Simply make a robots.txt in your root directory (in most cases the public_html directory) and add these lines to it:
User-agent: *
Disallow: /\#!/
This is meant to stop Google from crawling all pages under the /#!/ subdirectory.
However:
If the page has already been indexed by Googlebot, using a robots.txt
file won't remove it from the index. You'll either have to use the
Google Webmaster Tools URL removal tool after you apply the
robots.txt, or instead you can add a noindex command to the page via a
<meta> tag or X-Robots-Tag in the HTTP Headers.
(Source)
Here is a link to the Google Webmaster Tools URL Removal Tool
So add this to pages you don't want indexed:
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />
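The quote also mentions the `X-Robots-Tag` HTTP header as an alternative to the meta tag. Below is a minimal sketch of how that header could be added in a Python WSGI application. The wrapper name `add_noindex_header` and the `should_noindex` predicate are assumptions for illustration, and how you decide which requests get the header depends entirely on your own setup.

```python
def add_noindex_header(app, should_noindex):
    """Wrap a WSGI app so selected responses carry an X-Robots-Tag header.

    app:            any WSGI application callable.
    should_noindex: hypothetical predicate taking the request path and
                    returning True when the response must not be indexed.
    """
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            if should_noindex(environ.get("PATH_INFO", "")):
                # Same directives as the meta tag above, sent as a header.
                headers = list(headers) + [("X-Robots-Tag", "noindex, nofollow")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped
```

Wrapping an existing app as `add_noindex_header(app, lambda path: path.startswith("/private/"))` would, under these assumptions, add the header only to responses for paths below `/private/`.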

How to submit a website specifically to Google.fr, Google.de

The situation is:
One website (based on the Magento ecommerce solution) with different storeviews, all accessible through the same domain; an extension then redirects to the correct storeview based on customer location.
I have one storeview for Germany, one for the USA, and a worldwide fallback. The first is in EUR with tax included; the second and third are in USD with tax excluded.
I submit my product price using structured data (itemprop).
I have one sitemap for each storeview and submit them all to Google.
The problem: in Germany, when I google my product, I get the URL from my worldwide storeview (which is not a killer, as my extension will redirect afterwards), but with the USD price.
How can I submit the sitemap of my German storeview to Google.de, instead of the one from my worldwide storeview?
Use ccTLD
The best way to target different countries is to use ccTLDs (country-code top-level domains).
This is what they're made for, and Google uses them to determine the targeted location of your website.
Configure Google Webmaster Tools
In Google Webmaster Tools, you can set a geographic target for a website.
As said on Google Help Center:
Set a geographic target:
On the Webmaster Tools Home page, click the site you want.
Click the gear icon (top right corner), and then click Site Settings.
In the Geographic target section, select the option you want.
If you want to ensure that your site is not associated with any country or region, select Unlisted in the drop-down list.
Use Link rel alternate hreflang
You can declare the different language versions of a page in the <head> of your pages to target countries.
This uses the <link> tag like this:
<link rel="alternate" href="http://example.com/fr" hreflang="fr-FR" />
<link rel="alternate" href="http://example.com/de" hreflang="de-DE" />
<link rel="alternate" href="http://example.com/us" hreflang="en-US" />
<link rel="alternate" href="http://example.com/" hreflang="x-default" />
You need to declare every version of each URL on every page. The example above needs to be present on every URL listed in the example.
Read further on Google Help Center about telling Google your different localized target.
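Because every version has to declare every other version, a missing return link is an easy mistake to make. The following Python sketch checks a set of declarations for that problem. The input format (a dict of page URL to its declared `hreflang` targets) and the function name `missing_return_links` are assumptions for illustration; in practice you would fill the dict by crawling your own pages.

```python
def missing_return_links(declared):
    """Find hreflang targets that do not link back to the declaring page.

    declared: dict mapping a page URL to a dict of {hreflang: target_url}
              found in that page's <head>. (Hypothetical input format.)
    Returns a sorted list of (source, target) pairs where `target` has no
    hreflang entry pointing back to `source`.
    """
    missing = []
    for page, alternates in declared.items():
        for target in alternates.values():
            if target == page:
                continue  # a self-reference needs no return link
            back_links = declared.get(target, {}).values()
            if page not in back_links:
                missing.append((page, target))
    return sorted(missing)
```

An empty result means every declared alternate links back to the page that declared it.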

Master Sitemap link in Header of site or Robots.txt

I have a master sitemap that contains links to other sitemaps and is accessible at a path like:
www.website.com/sitemap.xml
I wanted to ask if this is enough for the search engines, or if I need to link to it from my site.
Linking: I know I can use a robots.txt file, but is it possible to just add a link to the head of the site, something like (and I'm just guessing):
<head>
<link rel="sitemap" type="application/xml" title="Sitemap" href="/sitemap.xml">
</head>
Thank you
Adam
This is totally okay.
The sitemap should always be located in the root, and that is the only place where the search engines will look.
I suggest using Google Webmaster Tools to submit the sitemap for your domain, so you can get indexed and monitor search engine behavior.
Hopefully this info will help you.

Multiple Sitemap: entries in robots.txt?

I have been searching around using Google but I can't find an answer to this question.
A robots.txt file can contain the following line:
Sitemap: http://www.mysite.com/sitemapindex.xml
but is it possible to specify multiple sitemap index files in the robots.txt and have the search engines recognize that and crawl ALL of the sitemaps referenced in each sitemap index file? For example, will this work:
Sitemap: http://www.mysite.com/sitemapindex1.xml
Sitemap: http://www.mysite.com/sitemapindex2.xml
Sitemap: http://www.mysite.com/sitemapindex3.xml
Yes, it is possible to have more than one sitemap index file:
You can have more than one Sitemap index file.
(Emphasis mine.)
Yes, it is possible to list multiple sitemap files within robots.txt; see also the sitemaps.org site:
You can specify more than one Sitemap file per robots.txt file.
Sitemap: http://www.example.com/sitemap-host1.xml
Sitemap: http://www.example.com/sitemap-host2.xml
Emphasis mine. This cannot be misread, I'd say; simply put, it can be done.
This is also necessary for cross-submits, for which, by the way, robots.txt was chosen.
By the way, Google, Yahoo and Bing are all members of sitemaps.org:
Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.
So you can rest assured that your sitemap entries will be properly read by the search engine bots.
Submitting them via Webmaster Tools cannot hurt either, as John Mueller commented.
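To see how a parser can pick up several `Sitemap:` lines, here is a small Python sketch using the standard library's `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). It only shows how this particular parser reads the file and says nothing certain about how any given search engine behaves; the helper name `sitemaps_in_robots` is made up for the example.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return every Sitemap: URL found in the text of a robots.txt file."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when the file has no Sitemap: lines.
    return parser.site_maps() or []
```

Feeding it the three `Sitemap:` lines from the question returns all three URLs in the order they were listed.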
If your sitemap is over 50 MB (uncompressed) or has more than 50,000 entries, Google requires that you use multiple sitemaps bundled with a sitemap index file.
Using Sitemap index files (to group multiple sitemap files)
In your robots.txt point to a sitemap index which should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.example.com/sitemap1.xml.gz</loc>
<lastmod>2012-10-01T18:23:17+00:00</lastmod>
</sitemap>
<sitemap>
<loc>http://www.example.com/sitemap2.xml.gz</loc>
<lastmod>2012-01-01</lastmod>
</sitemap>
</sitemapindex>
It's recommended to create a sitemap index file rather than putting separate sitemap URLs in your robots.txt file.
Then, put the sitemap index URL in your robots.txt file as below.
Sitemap: http://www.yoursite.com/sitemap_index.xml
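If the sitemaps are generated by a script, splitting the URL list and writing the index can be automated. The sketch below is one possible way to do it in Python with the standard library, using the 50,000-URLs-per-file limit from the sitemaps.org protocol. The function names and the idea of numbering the files are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemaps.org protocol


def split_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into chunks that each fit into one sitemap file."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index document listing the given sitemap file URLs."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Each chunk from `split_urls` would be written to its own sitemap file, and the URLs of those files passed to `build_sitemap_index`; the resulting index is the single URL that goes into robots.txt.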
If you want to learn how to create a sitemap index, follow this guide from sitemaps.org.
Best Practice:
Create image and video sitemaps separately if your website has a huge amount of such content.
Check the spelling of the robots file: it should be robots.txt, not robot.txt or any other misspelling.
Put the robots.txt file in the root directory only.
For more info, you can visit the robots.txt official website.
You need to specify this code in your sitemap.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.exemple.com/sitemap1.xml.gz</loc>
</sitemap>
<sitemap>
<loc>http://www.exemple.com/sitemap2.xml.gz</loc>
</sitemap>
</sitemapindex>
source: https://support.google.com/webmasters/answer/75712?hl=fr#
It is possible to write them, but it is up to the search engine to know what to do with them. I suspect many search engines will either "keep digesting" more and more tokens, or alternatively, take the last sitemap they find as the real one.
I propose that the question be "if I want ____ search engine to index my site, would I be able to define multiple sitemaps?"

Resources