I (and the other sysops on WikipediaNL) recently got a message from Ilse, an important Dutch search engine (in fact, the largest one specifically directed towards Dutch content). They wanted to include nl.wikipedia.org in their search engine, but had problems with the 15-second crawl-delay specified in our robots.txt.

And indeed, when we're talking about a search engine's bot spidering Wikipedia, a delay of 15 seconds between visits seems rather excessive. If a search engine were to crawl the English Wikipedia at that rate (and we do want to be in the various search engines, don't we?), it would take them about 2 months of 24-hour days, probably more (and that was based on a rather conservative estimate of the number of pages). To me this sounds like too much. I would therefore like to ask that we consider a considerable reduction of the crawl-delay. Alternatively, we could reduce it only for a 'whitelist' of trusted user agents.
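
To make the arithmetic concrete (the page count here is my own rough assumption for illustration, not a measured figure):

    86,400 s/day / 15 s/request = 5,760 pages/day
    400,000 pages / 5,760 pages/day ≈ 69 days

so even a few hundred thousand pages already means over two months of continuous crawling, before counting talk and project pages.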
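
For reference, a whitelist could look something like the following in robots.txt. Crawl-delay is a non-standard directive that only some crawlers honour, and 'Ilsebot' is just a placeholder for whatever user agent string they actually send; crawlers obey the most specific User-agent block that matches them:

    # Hypothetical whitelisted crawler: short delay
    User-agent: Ilsebot
    Crawl-delay: 1

    # Everyone else keeps the current conservative delay
    User-agent: *
    Crawl-delay: 15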
Andre Engels
wikitech-l@lists.wikimedia.org