On 10/27/07, Anthony <wikimail@inbox.org> wrote:
> [[Wikipedia:Database download#Please do not use a web crawler]]
> Have Google and Yahoo been informed of this policy?
Context: "Please do not use a web crawler to download large numbers of articles."
As in "Don't use a web crawler to get large amounts of data for your own personal use" (i.e. for mirroring). And it's quite valid: if lots of people downloaded the entire site one article at a time, we'd end up with big problems, especially seeing as the load would be evenly distributed across many articles, including rarely viewed ones that are less likely to be cached, and hence there'd be a lot of extra parsing happening.
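To illustrate the parsing point, here is a rough sketch in Python. It is a toy model, not a description of how the Wikimedia servers actually cache anything: the LRU cache, the page and request counts, and the 1/rank popularity curve for ordinary readers are all assumptions made up for the example. It counts how many requests miss a cache of parsed pages (and so would need a fresh parse) for reader-like traffic versus a crawler walking every article in turn.

```python
import random
from collections import OrderedDict


def count_parses(requests, cache_size):
    """Count requests that miss an LRU cache and so need a fresh parse."""
    cache = OrderedDict()
    parses = 0
    for page in requests:
        if page in cache:
            # Cache hit: mark the page as most recently used.
            cache.move_to_end(page)
        else:
            # Cache miss: the page has to be parsed again.
            parses += 1
            cache[page] = True
            if len(cache) > cache_size:
                # Evict the least recently used page.
                cache.popitem(last=False)
    return parses


def simulate(num_pages=10_000, num_requests=50_000, cache_size=1_000, seed=0):
    """Return (reader_parses, crawler_parses) for the same number of requests."""
    rng = random.Random(seed)
    pages = range(num_pages)

    # Assumed reader traffic: popularity falls off as 1/rank.
    weights = [1.0 / (rank + 1) for rank in pages]
    reader = rng.choices(pages, weights=weights, k=num_requests)

    # Assumed crawler traffic: every article in turn, spread evenly.
    crawler = [i % num_pages for i in range(num_requests)]

    return count_parses(reader, cache_size), count_parses(crawler, cache_size)


if __name__ == "__main__":
    reader_parses, crawler_parses = simulate()
    print("reader-like traffic: ", reader_parses, "parses")
    print("crawler-like traffic:", crawler_parses, "parses")
```

With these made-up numbers the crawler never hits the cache, because each article comes around again only after more distinct pages than the cache can hold, while the reader-like traffic keeps landing on the same popular pages.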
Google and Yahoo have nothing to do with this: search engines represent a tiny portion of our requests (whereas many users each doing a lot of requesting would not), and they use the data obtained for the public benefit.