Thanks, Eike. I brought your reply back to the list so it can benefit other people. I hope you don't mind. And you are right - I am not Google, unfortunately.
Tony
-----Original Message-----
From: Eike Frost [mailto:ei@kefro.st]
Sent: Tuesday, January 30, 2007 2:28 PM
To: Webmaster
Subject: Re: [Wikitech-l] FW: Our IP was blocked by mistake
Hi,
note that I don't speak for Wikimedia.
On Tuesday 30 January 2007 18:41, Webmaster wrote:
Just something that came to my mind... Google caches the Wikipedia pages just like we were doing. Are you blocking Google as well?
You are not Google. You are not a search engine. Even if you were, Google takes great care not to overwhelm the servers it is crawling, and Google does not "frame" Wikipedia in ads when caching it. You're going to be hard-pressed to find ads on http://google.com/search?q=cache:-L-LvI3XPXcJ:en.wikipedia.org/+wikipedia&am... for instance. So no, Google is not doing what you are doing.
Wikipedia's license allows you to use the content in that manner. This does not mean it also allows you to overwhelm and/or abuse the servers, especially since you are not the ones paying for them. The downloadable Wikimedia database snapshots have been available for years precisely for the kind of reuse you are making of Wikipedia's content.
So please use those instead of trying to argue that you should be allowed to crawl all of Wikipedia with an abusive ASP script whose benefit to Wikipedia is pretty much exactly nil (and I categorize the script as abusive purely on the basis of what has been said on the mailing list). You're not paying for the content and are making plenty of money plastering ads all over it; the least you can do is research how to do that without harming Wikipedia's infrastructure, and use the resources made available for precisely those purposes. If your website ever becomes as important as Google's, then maybe you can ask Wikimedia to generate special files especially for you (like they do for Yahoo), or pay them to do so.
My $0.02, anyway. And the dumps are really nice to work with, too. I've used them in the past :-)
--Eike
wikitech-l@lists.wikimedia.org