Neil Harris wrote:
geni wrote:
[snip]
Certain searches of existing content would be useful, the most obvious being running a copy of the database against a copy of Britannica.
And against other databases of copyrighted texts, such as InfoTrac (http://www.gale.com/onefile/) or similar, and things like Google Book Search.
-- Neil
Just a thought: the en: Wikipedia gets about 3 edits a second. I wonder whether the Foundation could make a special plea to Google for a dedicated search pipe that would let us run, say, 30 searches a second, 24 hours a day (only a tiny, tiny fraction of their overall capacity), in recognition of the _very_ substantial advertising revenue they must surely be receiving as a side effect of having Wikipedia's content online to draw in search queries.
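If Google ever granted such a pipe, the throttling on our side would be straightforward. A minimal sketch in Python, assuming an agreed budget of 30 queries a second; the google_search() call is purely a placeholder, since no such arrangement or API exists today:

    import time, threading

    class RateLimiter:
        """Space requests evenly so we never exceed an agreed rate."""
        def __init__(self, per_second):
            self.interval = 1.0 / per_second
            self.lock = threading.Lock()
            self.next_slot = time.monotonic()

        def wait(self):
            # Serialize callers and hand each one the next free time slot.
            with self.lock:
                now = time.monotonic()
                self.next_slot = max(self.next_slot, now) + self.interval
                delay = self.next_slot - self.interval - now
            if delay > 0:
                time.sleep(delay)

    limiter = RateLimiter(30)
    # for query in pending_edit_queries:
    #     limiter.wait()
    #     results = google_search(query)  # placeholder, not a real API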
(Think about it: even if only 20% of Wikimedia's 4000 or so page loads a second come from Google users who are expecting something like Wikipedia content, and Google makes only $0.25 CPM serving ads on the searches that led to those pages, that comes to an income stream of $0.20 per _second_ from Wikipedia-related searches, or a total of about $6.3M a year...)
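For anyone who wants to check that arithmetic, here it is spelled out; all the inputs are the rough estimates above, not measured values:

    page_loads_per_sec = 4000    # Wikimedia page loads per second
    google_fraction = 0.20       # share arriving via Google searches
    cpm_dollars = 0.25           # ad revenue per 1000 such searches

    per_second = page_loads_per_sec * google_fraction * cpm_dollars / 1000
    per_year = per_second * 60 * 60 * 24 * 365
    print(per_second)   # 0.20 dollars per second
    print(per_year)     # 6307200.0, i.e. ~$6.3M a year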
If so, we could integrate the copyright violation bot into the toolserver, or into the MW server cluster itself.
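For concreteness, here is a minimal sketch of the kind of comparison geni describes, as such a bot might run it against a local plain-text reference corpus. The 8-word shingle size and the 5% threshold are arbitrary assumptions, and any real corpus access would need to be negotiated:

    def shingles(text, n=8):
        """All n-word sequences ("shingles") in the text, lowercased."""
        words = text.lower().split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def looks_like_copyvio(edit_text, reference_text, threshold=0.05):
        """True if enough of the edit's shingles appear in the reference."""
        a = shingles(edit_text)
        b = shingles(reference_text)
        if not a:
            return False
        return len(a & b) / len(a) >= threshold

    # Matches would be queued for human review, never deleted
    # automatically -- shared quotations and titles cause false positives.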
-- Neil