Brion Vibber wrote:
On Tue, 2003-01-07 at 05:25, Erik Moeller wrote:

Upon noticing that the main pages and mailing list archives _are_ indexed, I have my suspicions about our robots.txt file; the line:
Disallow: /w
perhaps should be:
Disallow: /w/
The former may be accidentally blocking /wiki/<article-name> paths -- which of course form the bulk of our content! -- in addition to the scripted pages via direct access to the /w subdirectory that it's intended to block.
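For anyone who wants to see the difference in action, here is a minimal sketch using Python's standard-library robots.txt parser. The example hostname and the /w/wiki.phtml path are illustrative only, and the assumption is that the stdlib parser applies the same prefix matching that the robots exclusion standard suggests (and that Googlebot presumably uses):

```python
from urllib.robotparser import RobotFileParser

# The bare "Disallow: /w" rule, interpreted as a prefix match,
# blocks ANY path starting with "/w" -- including /wiki/ articles.
broad = RobotFileParser()
broad.parse([
    "User-agent: *",
    "Disallow: /w",
])
print(broad.can_fetch("*", "http://example.org/w/wiki.phtml"))    # False
print(broad.can_fetch("*", "http://example.org/wiki/Main_Page"))  # False

# With the trailing slash, only the /w/ script directory is blocked,
# and article pages under /wiki/ remain crawlable.
narrow = RobotFileParser()
narrow.parse([
    "User-agent: *",
    "Disallow: /w/",
])
print(narrow.can_fetch("*", "http://example.org/w/wiki.phtml"))    # False
print(narrow.can_fetch("*", "http://example.org/wiki/Main_Page"))  # True
```

The trailing slash matters precisely because the comparison is a simple prefix test, not a directory-aware match.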
I have updated the robots.txt file; if indeed this is how the googlebot was interpreting the line, I hope we can be re-spidered soon...
-- brion vibber (brion@pobox.com / brion@wikipedia.org)
Alas, I think you're right. The robots exclusion standard suggests a simple substring comparison be used in implementations, and all of its examples of directory exclusion use the trailing slash.
-- Neil