* Aryeh Gregor Simetrical+wikilist@gmail.com [Tue, 25 Aug 2009 13:13:56 -0400]:
That's wrong. The canonical version of a page must be a page with substantially identical content. Edit pages serve totally different HTML; rel=canonical pointing to the article will just be ignored by search engines. See here for a discussion of how rel=canonical works:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.ht...
Thanks for pointing that out.
Note, e.g., "We allow slight differences, e.g., in the sort order of a table of products. We also recognize that we may crawl the canonical and the duplicate pages at different points in time, so we may occasionally see different versions of your content." Totally different content, no.
Well, semantically an edit page and an action=view page are certainly not totally different: both contain very similar information. But I cannot go against the standards. They are something like a law: you don't always like it, but you have to obey it.
Anyway, it seems that Yandex crawler doesn't like the meta noindex rules in the header of the page, giving an error (warning) message in the stats of their webmaster tools.
What does the warning say? Ideally, of course, you should ban them in robots.txt, so the search engine doesn't have to bother fetching the URL.
I've banned them in robots.txt. The warning is produced by non-existing titles, which also have meta noindex. There are some links from foreign sites to non-existing titles, something like "http://mywiki.org/wiki/nonexsitingtitle", which I obviously cannot disable. Yandex gives the warning "Document contains meta-tag noindex" (approximately translated from Russian). There are lots of such warnings. It is a bit strange that this is a warning at all; Google doesn't give such a warning.
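To illustrate why robots.txt covers the edit URLs but not the links to non-existing titles, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `/w/` script path and the `title=Foo` URL are assumptions made for the example, not the actual setup of the wiki; only the `/wiki/nonexsitingtitle` URL is taken from above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes index.php lives under /w/ and
# articles are served under /wiki/ (not confirmed in this thread).
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An edit URL goes through the script path, so the rule blocks it.
edit_url = "http://mywiki.org/w/index.php?title=Foo&action=edit"
# A non-existing title linked from a foreign site uses the article path,
# which cannot be disallowed without also blocking real articles.
missing_url = "http://mywiki.org/wiki/nonexsitingtitle"

edit_allowed = parser.can_fetch("*", edit_url)
missing_allowed = parser.can_fetch("*", missing_url)

print("edit page crawlable:", edit_allowed)
print("missing title crawlable:", missing_allowed)
```

Under these assumed rules the crawler still fetches the non-existing title, finds the meta noindex there, and that is presumably where the warning comes from.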
The purpose is to tell search engines which URL you'd prefer them to present to users, if the same content is being served under multiple URLs. It is not meant to artificially inflate rankings by counting unindexed pages as contributing to some entirely different page of your choosing, and using it that way won't actually work. Since search engines were already using heuristics to identify duplicate content, and might well continue to use those exact same heuristics to validate rel=canonical, it might not improve rankings at all.
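As a rough sketch of the intended use described above, the following emits a rel=canonical link only for URLs that serve the same article content under a different address, and nothing for edit pages. The helper name `canonical_link`, the `/wiki/` and `/w/index.php` URL layout, and the `ARTICLE_PREFIX` value are assumptions for illustration; this is not how MediaWiki itself implements it, and it ignores cases such as `oldid` or `diff` parameters, which also change the content.

```python
from html import escape
from typing import Optional
from urllib.parse import parse_qs, quote, urlsplit

# Assumed preferred URL form for articles (hypothetical layout).
ARTICLE_PREFIX = "http://mywiki.org/wiki/"


def canonical_link(url: str) -> Optional[str]:
    """Return a <link rel="canonical"> tag for view URLs, else None."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)

    # Only plain views serve substantially identical content;
    # action=edit and friends serve different HTML, so emit nothing.
    action = query.get("action", ["view"])[0]
    if action != "view":
        return None

    if parts.path.startswith("/wiki/"):
        # Title is already percent-encoded in the path.
        title = parts.path[len("/wiki/"):]
    else:
        # parse_qs decodes the title, so re-encode it for the URL.
        title = quote(query.get("title", [""])[0])
    if not title:
        return None

    href = escape(ARTICLE_PREFIX + title, quote=True)
    return f'<link rel="canonical" href="{href}" />'


print(canonical_link("http://mywiki.org/w/index.php?title=Foo"))
print(canonical_link("http://mywiki.org/w/index.php?title=Foo&action=edit"))
```

The first call yields a canonical link to the `/wiki/Foo` form; the second yields `None`, matching the point that an edit page should not claim the article as its canonical version.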
I am not so sure that such inflation is artificial. It would be artificial if the article/revision were not the same, or if MediaWiki-generated HTML were mixed with other HTML. But anyway, I cannot change how the search engines will interpret it.
Dmitriy