* Aryeh Gregor <Simetrical+wikilist(a)gmail.com> [Tue, 25 Aug 2009 13:13:56 -0400]:
> That's wrong. The canonical version of a page must be a page with
> substantially identical content. Edit pages serve totally different
> HTML; rel=canonical pointing to the article will just be ignored by
> search engines. See here for a discussion of how rel=canonical works:
> http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.h…
Thanks for pointing that out.
> Note, e.g., "We allow slight differences, e.g., in the sort order of a
> table of products. We also recognize that we may crawl the canonical
> and the duplicate pages at different points in time, so we may
> occasionally see different versions of your content." Totally
> different content, no.
Well, semantically an edit page and an action=view page are certainly
not totally different. Both contain very similar information. But I
cannot go against standards, that's impossible. It's something like
law: you don't always like it, but you have to obey it.
> > Anyway, it seems that Yandex crawler doesn't like the meta noindex
> > rules in the header of the page, giving an error (warning) message in
> > the stats of their webmaster tools.
> What does the warning say? Ideally, of course, you should ban them in
> robots.txt, so the search engine doesn't have to bother fetching the
> URL.
I've banned them in robots.txt. The warning is produced for
non-existing titles, which also have meta noindex. There are some links
from foreign sites to non-existing titles, which I obviously cannot
disable, something like "http://mywiki.org/wiki/nonexsitingtitle".
Yandex gives the warning "Document contains meta-tag noindex"
(approximately translated from Russian). There are lots of such
warnings. It's a bit strange that this is a warning at all. Google
doesn't give such a warning.
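
To illustrate why robots.txt alone does not cover this case, here is a
minimal Python sketch using the standard urllib.robotparser module. The
robots.txt content and the URL layout (articles under /wiki/, edit pages
served through /index.php) are assumptions made for illustration, not
taken from the actual wiki; is_crawlable is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. Assumes edit/history pages are served through
# /index.php while articles (existing or not) live under /wiki/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
"""


def is_crawlable(url, agent="*"):
    """Return True if the assumed robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# Edit pages are blocked before the crawler ever fetches them.
print(is_crawlable("http://mywiki.org/index.php?title=Foo&action=edit"))

# A non-existing title shares the /wiki/ prefix with real articles, so a
# path rule cannot single it out; only the meta noindex in the page can.
print(is_crawlable("http://mywiki.org/wiki/nonexsitingtitle"))
```

Under these assumptions the first URL is disallowed and the second is
allowed, which is why the crawler still fetches non-existing titles and
then sees their meta noindex.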
> The purpose is to tell search engines which URL you'd prefer them to
> present to users, if the same content is being served under multiple
> URLs. It is not meant to artificially inflate rankings by counting
> unindexed pages as contributing to some entirely different page of
> your choosing, and using it that way won't actually work. Since
> search engines were already using heuristics to identify duplicate
> content, and might well continue to use those exact same heuristics to
> validate rel=canonical, it might not improve rankings at all.
I am not so sure that such inflation is artificial. It would be
artificial if the article/revision were not the same, or if
MediaWiki-generated HTML were mixed with other HTML. But anyway, I
cannot change how the search engines will interpret it.
Dmitriy