On 1/17/06, Andrew Gray <shimgray(a)gmail.com> wrote:
> It's logistically quite tricky to arrange matters so the spiders
> understand the difference between a talk page and a "real" page;
> "allowing" isn't the key, it's "why don't we prevent it", and the
> answer is "if we tried it probably wouldn't work very well". Not to
> stop anyone attempting something, but...
Wouldn't adding
Disallow: /wiki/Wikipedia:Articles_for_deletion
to
http://meta.wikimedia.org/robots.txt do the trick? I assume the
search engines will treat subpages as directories, as they are
separated by slashes.
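For what it's worth, I believe Disallow lines are plain URL-prefix
matches rather than true directory matches, so a rule like that should
catch the subpages as well. A quick sanity-check sketch using Python's
standard robotparser module (the subpage name here is just made up for
illustration):

from urllib import robotparser

# The rule suggested above. Disallow entries are prefix matches, so
# anything whose path starts with this string is covered.
rules = [
    "User-agent: *",
    "Disallow: /wiki/Wikipedia:Articles_for_deletion",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# "Some_article" is a hypothetical subpage name, purely for the test.
print(rp.can_fetch("*", "/wiki/Wikipedia:Articles_for_deletion"))               # False
print(rp.can_fetch("*", "/wiki/Wikipedia:Articles_for_deletion/Some_article"))  # False
print(rp.can_fetch("*", "/wiki/Some_article"))                                  # True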
Can robots.txt use wildcards? If it can, we could quite easily
restrict caching of the entire Wikipedia namespace, if we wanted (and
I doubt we would), using:
Disallow: /wiki/Wikipedia:*
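As far as I know, wildcards aren't part of the original robots
exclusion standard, though some crawlers (Google's, at least) honour
"*" as an extension. Because Disallow is a prefix match anyway, I
don't think the wildcard is even needed: a bare
"Disallow: /wiki/Wikipedia:" should already cover the whole namespace
for spec-following crawlers. The same kind of sketch as above, with a
couple of illustrative page names:

from urllib import robotparser

# Whole-namespace rule; no wildcard required, since this is a prefix.
rules = [
    "User-agent: *",
    "Disallow: /wiki/Wikipedia:",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/wiki/Wikipedia:Village_pump"))  # False
print(rp.can_fetch("*", "/wiki/Village_pump"))            # True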
--
Sam