Paul Ebermann wrote:
> The second step (which can occur some weeks later)
> is taking URLs from their database, and retrieve
> the page. When they are excluded in the respective
> robots.txt or by a meta-noindex, they are deleted
> from the database.
I'm trying to understand the causes of this problem, because I don't
want it to happen to me. So far it hasn't happened to me, and it has
been a mystery to me why it happens to Wikipedia. For susning.nu, I
use "meta noindex" only (no robots.txt), and I never see any edit
links on Google.
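For concreteness, by "meta noindex" I mean a robots meta tag in the page head, something like this (the exact attribute casing doesn't matter):

```html
<head>
  <!-- tells crawlers not to list this page in search results -->
  <meta name="robots" content="noindex">
</head>
```

The crucial property of this approach is that the crawler has to successfully fetch and parse the page before it can see the directive at all.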
What you just wrote made me think that performance might be the key to
the problem: Perhaps the Wikipedia server was slow, timed out, or down
when Googlebot tried to retrieve the edit URL, so Google never
understood that this page was "noindex". Given Wikipedia's large
number of pages and sluggish performance, this could easily happen to
some pages. Not all of them, but enough to become a problem.
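To sketch the mechanism I'm speculating about: a crawler can only honor a meta-noindex it actually manages to parse. The snippet below (my own illustration, not Google's actual logic) shows the parsing half; if the fetch times out before this check runs, the crawler learns nothing and the stale URL can stay indexed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            content = a.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindex(html_text):
    """True if the page carries a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
print(is_noindex(page))  # True
```

If the HTTP request for the edit URL fails or times out, `is_noindex` never runs, so from the crawler's point of view the page simply has no directive, which is exactly what could leave Wikipedia's edit links in the index.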
The only real difference that I can think of between susning.nu and
Wikipedia is that my site's performance has always been pretty good.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik -
http://aronsson.se/